NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. CROSS REFERENCE TO RELATED APPLICATIONS 

This application claims the priority benefit of U.S. Provisional Application Serial No.
60/416,186 filed October 2, 2002 entitled "Novel Nucleic Acids and Polypeptides", which
contains material previously disclosed in the following applications: U.S. Application Serial
No. 10/084,643 filed February 26, 2002 entitled "Novel Nucleic Acids and Polypeptides",
Attorney Docket No. 21272-502; PCT Application Serial No. PCT/US00/35017 filed
December 22, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney
Docket No. 784CIP3A/PCT; PCT Application Serial No. PCT/US01/02623 filed January 25,
2001 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No.
785CIP3/PCT; PCT Application Serial No. PCT/US01/03800 filed February 5, 2001 entitled
"Novel Contigs Obtained from Various Libraries", Attorney Docket No. 787CIP3/PCT; PCT
Application Serial No. PCT/US01/04927 filed February 26, 2001 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 788CIP3/PCT; PCT Application
Serial No. PCT/US01/04941 filed March 5, 2001 entitled "Novel Contigs Obtained from
Various Libraries", Attorney Docket No. 789CIP3/PCT; PCT Application Serial No.
PCT/US01/08631 filed March 30, 2001 entitled "Novel Contigs Obtained from Various
Libraries", Attorney Docket No. 790CIP3/PCT; PCT Application Serial No.
PCT/US01/08656 filed April 18, 2001 entitled "Novel Contigs Obtained from Various
Libraries", Attorney Docket No. 791CIP3/PCT; all of which are incorporated herein by
reference in their entirety.



2. BACKGROUND OF THE INVENTION 

2.1 TECHNICAL FIELD

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods. 

2.2 BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such
as lymphokines, interferons, circulating soluble factors, chemokines, and interleukins) has
matured rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid sequence
of the protein in the case of hybridization cloning; activity of the protein in the case of
expression cloning). More recent "indirect" cloning techniques such as signal sequence
cloning, which isolates DNA sequences based on the presence of a now well-recognized
secretory leader sequence motif, as well as various PCR-based or low stringency
hybridization-based cloning techniques, have advanced the state of the art by making
available large numbers of DNA/amino acid sequences for proteins that are known to have
biological activity, for example, by virtue of their secreted nature in the case of leader
sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in,
for example, diagnostics, forensics, gene mapping, identification of mutations responsible
for genetic disorders or other traits, assessment of biodiversity, and production of many
other types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as
allelic variants, antisense polynucleotide molecules, and antibodies that specifically recognize
one or more epitopes present on such polypeptides, as well as hybridomas producing such
antibodies.

The compositions of the present invention additionally include vectors, including
expression vectors, containing the polynucleotides of the invention, cells genetically engineered
to contain such polynucleotides and cells genetically engineered to express such
polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public
databases. The invention relates also to the proteins encoded by such polynucleotides, along
with therapeutic, diagnostic and research utilities for these polynucleotides and proteins. These
nucleic acid sequences are designated as SEQ ID NO: 1-684, or 1369-1966 and are provided in
the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases or unknown. In the amino
acids provided in the Sequence Listing, an asterisk (*) corresponds to the stop codon.
The nucleic acid sequences of the present invention also include nucleic acid sequences
that hybridize to the complement of SEQ ID NO: 1-684, or 1369-1966 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ
ID NO: 1-684, or 1369-1966. Also included is a polynucleotide comprising a nucleotide
sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-684, or
1369-1966 or a degenerate variant or fragment thereof. The identifying sequence can be 100
base pairs in length.
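
The 90% identity threshold over a 100 base pair identifying sequence can be illustrated with a small calculation. The following is a minimal sketch only: it assumes the two sequences are already aligned, of equal length and free of gaps, and it treats N (unknown base) as a mismatch. The function name `percent_identity` and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions carrying the same base in two pre-aligned sequences.

    Assumptions of this sketch: equal lengths, no gaps, and 'N' (unknown
    base) never counts as a match.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        a == b and a != "N"
        for a, b in zip(seq_a.upper(), seq_b.upper())
    )
    return 100.0 * matches / len(seq_a)


# Hypothetical 100-base identifying sequence.
reference = "ACGT" * 25
# Variant in which the 8 adenines among the first 32 bases are replaced by guanine.
variant = reference[:32].replace("A", "G") + reference[32:]

identity = percent_identity(reference, variant)
meets_threshold = identity >= 90.0
print(f"{identity:.1f}% identity, meets 90% threshold: {meets_threshold}")
```

A real comparison would normally start from an alignment that allows insertions and deletions; this sketch deliberately leaves that step out.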

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-684, or 1369-1966. The
sequence information can be a segment of any one of SEQ ID NO: 1-684, or 1369-1966 that
uniquely identifies or represents the sequence information of SEQ ID NO: 1-684, or 1369-1966.

A collection as used in this application can be a collection of only one polynucleotide.
The collection of sequence information or identifying information of each sequence can be
provided on a nucleic acid array. In one embodiment, segments of sequence information are
provided on a nucleic acid array to detect the polynucleotide that contains the segment. The
array can be designed to detect full-match or mismatch to the polynucleotide that contains the
segment. The collection can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences;
and host cells or organisms transformed with these expression vectors. Nucleic acid sequences
(or their reverse or direct complements) according to the invention have numerous applications
in a variety of techniques known to those skilled in the art of molecular biology, such as use as
hybridization probes, use as primers for PCR, use in an array, use in computer-readable media,
use in sequencing full-length genes, use for chromosome and gene mapping, use in the
recombinant production of protein, and use in the generation of anti-sense DNA or RNA, their
chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-684, or 1369-1966
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the
nucleic acid sequences of SEQ ID NO: 1-684, or 1369-1966 or novel segments or parts of the
nucleic acids provided herein are used in diagnostics for identifying expressed genes or, as well
known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed
sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-684,
or 1369-1966; a polynucleotide comprising any of the full length protein coding sequences of
SEQ ID NO: 1-684, or 1369-1966; and a polynucleotide comprising any of the nucleotide
sequences of the mature protein coding sequences of SEQ ID NO: 1-684, or 1369-1966. The
polynucleotides of the present invention also include, but are not limited to, a polynucleotide
that hybridizes under stringent hybridization conditions to (a) the complement of any one of the
nucleotide sequences set forth in SEQ ID NO: 1-684, or 1369-1966; (b) a nucleotide sequence
encoding any one of the amino acid sequences set forth in SEQ ID NO: 1-684, or 1369-1966;
(c) a polynucleotide which is an allelic variant of any polynucleotides recited above; (d) a
polynucleotide which encodes a species homologue (e.g. orthologs) of any of the proteins
recited above; or (e) a polynucleotide that encodes a polypeptide comprising a specific domain
or truncation of any of the polypeptides comprising an amino acid sequence set forth in SEQ ID
NO: 685-1368, or 1967-2564, or Tables 3A, 3B, 5, 7, or 8.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the
corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides with biological activity that are encoded by (a) any of the polynucleotides having
a nucleotide sequence set forth in SEQ ID NO: 1-684, or 1369-1966; or (b) polynucleotides that
hybridize to the complement of the polynucleotides of (a) under stringent hybridization
conditions. Biologically active variants of any of the polypeptide sequences in the Sequence
Listing, and "substantial equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%,
85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain biological
activity are also contemplated. The polypeptides of the invention may be wholly or partially
chemically synthesized but are preferably produced by recombinant means using the genetically
engineered cells (e.g. host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such
as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a
polynucleotide of the invention.

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the 
polypeptide from the culture or from the host cells. Preferred embodiments include those in 
which the protein produced by such processes is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety 
of techniques known to those skilled in the art of molecular biology. These techniques 
include use as hybridization probes, use as oligomers, or primers, for PCR, use for 
chromosome and gene mapping, use in the recombinant production of protein, and use in 
generation of anti-sense DNA or RNA, their chemical analogs and the like. For example,
when the expression of an mRNA is largely restricted to a particular cell or tissue type,
polynucleotides of the invention can be used as hybridization probes to detect the presence
of the particular cell or tissue mRNA in a sample using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for 
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a 
polypeptide of the invention can be used to generate an antibody that specifically binds the
polypeptide. Such antibodies, particularly monoclonal antibodies, are useful for detecting or 
quantitating the polypeptide in tissue. The polypeptides of the invention can also be used as 
molecular weight markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical
condition which comprises the step of administering to a mammalian subject a
therapeutically effective amount of a composition comprising a polypeptide of the present
invention and a pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized,
for example, in methods for the prevention and/or treatment of disorders involving aberrant
protein expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for
example, be utilized as part of prognostic and diagnostic evaluation of disorders as recited
herein and for the identification of subjects exhibiting a predisposition to such conditions.
The invention provides a method for detecting the polynucleotides of the invention in a
sample, comprising contacting the sample with a compound that binds to and forms a
complex with the polynucleotide of interest for a period sufficient to form the complex and
under conditions sufficient to form a complex and detecting the complex such that if a
complex is detected, the polynucleotide of interest is detected. The invention also provides a
method for detecting the polypeptides of the invention in a sample comprising contacting the
sample with a compound that binds to and forms a complex with the polypeptide under
conditions and for a period sufficient to form the complex and detecting the formation of the
complex such that if a complex is formed, the polypeptide is detected.

The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of the
invention. Furthermore, the invention provides methods for evaluating the efficacy of drugs,
and monitoring the progress of patients, involved in clinical trials for the treatment of
disorders as recited above.

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides and/or
polypeptides of the invention. Such methods can be utilized, for example, for the
identification of compounds that can ameliorate symptoms of disorders as recited herein.
Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The
invention provides a method for identifying a compound that binds to the polypeptides of the
invention comprising contacting the compound with a polypeptide of the invention in a cell
for a time sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell; and detecting the complex by detecting
the reporter gene sequence expression such that if expression of the reporter gene is detected
the compound that binds to a polypeptide of the invention is identified.

The methods of the invention also provide methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and
other substances that modulate the overall activity of the target gene products. Compounds
and other substances can affect such modulation either on the level of target gene/protein
expression or target protein activity.

The polypeptides of the present invention and the polynucleotides encoding them are
also useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2A and 2B); for which
they have a signature region (as set forth in Tables 3A and 3B); or for which they have
homology to a gene family (as set forth in Tables 4A and 4B). If no homology is set forth
for a sequence, then the polypeptides and polynucleotides of the present invention are useful
for a variety of applications, as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms
"a", "an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of
the natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of
secretory or enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only certain portion(s) of the nucleic acids bind or it
may be "complete" such that total complementarity exists between the single stranded
molecules. The degree of complementarity between the nucleic acid strands has significant
effects on the efficiency and strength of the hybridization between the nucleic acid strands.
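
The base-pairing rule and the distinction between "partial" and "complete" complementarity can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, only the four DNA bases are handled, and the two strands are assumed to be of equal length and aligned without gaps.

```python
# Watson-Crick pairing partners for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def paired_fraction(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when two 5'->3' strands of equal
    length are aligned antiparallel. 1.0 means complete complementarity;
    anything lower is partial."""
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS[a] == b
        for a, b in zip(strand_a.upper(), reversed(strand_b.upper()))
    )
    return paired / len(strand_a)


# 5'-AGT-3' pairs with 3'-TCA-5', which reads 5'-ACT-3'.
partner = reverse_complement("AGT")
complete = paired_fraction("AGT", partner)
partial = paired_fraction("AGT", "ACC")
print(partner, complete, round(partial, 2))
```

The fraction of paired positions is only a crude proxy for hybridization strength, which also depends on base composition and reaction conditions.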

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ
line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a
steady and continuous source of germ cells for the production of gametes. The term
"primordial germ cells (PGCs)" refers to a small population of cells set aside from other cell
lineages particularly from the yolk sac, mesenteries, or gonadal ridges during embryogenesis 
that have the potential to differentiate into germ cells and other cells. PGCs are the source 
from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells are 
capable of self-renewal. Thus these cells not only populate the germ line and give rise to a 
plurality of terminally differentiated cells that comprise the adult specialized organs, but are 
able to regenerate themselves. 

The term "expression modulating fragment," EMF, means a series of nucleotides 
which modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. 
EMFs include, but are not limited to, promoters, and promoter modulating sequences 
(inducible elements). One class of EMFs are nucleic acid fragments which induce the 
expression of an operably linked ORF in response to a specific regulatory factor or 
physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or 
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or 
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or 
synthetic origin which may be single-stranded or double-stranded and may represent the 
sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like 
material. In the sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and 
N is A, C, G, or T (U) or unknown. It is contemplated that where the polynucleotide is 
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil). 
Generally, nucleic acid segments provided by this invention may be assembled from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory
elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at least about 7
nucleotides, more preferably at least about 9 nucleotides, more preferably at least about 11
nucleotides and most preferably at least about 17 nucleotides. The fragment is preferably
less than about 500 nucleotides, preferably less than about 200 nucleotides, more preferably
less than about 100 nucleotides, more preferably less than about 50 nucleotides and most
preferably less than 30 nucleotides. Preferably the probe is from about 6 nucleotides to
about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more preferably
from about 17 to 30 nucleotides and most preferably from about 20 to 25 nucleotides.
Preferably the fragments can be used in polymerase chain reaction (PCR), various
hybridization procedures or microarray procedures to identify or amplify identical or related
parts of mRNA or DNA molecules. A fragment or segment may uniquely identify each
polynucleotide sequence of the present invention. Preferably the fragment comprises a
sequence substantially similar to any one of SEQ ID NO: 1-684, or 1369-1966.

Probes may, for example, be used to determine whether specific mRNA molecules
are present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal
DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl. 1:241-250).
They may be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods
well known in the art. Probes of the present invention, their preparation and/or labeling are
elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in
Molecular Biology, John Wiley & Sons, New York NY, both of which are incorporated
herein by reference in their entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-684, or 1369-1966. The
sequence information can be a segment of any one of SEQ ID NO: 1-684, or 1369-1966 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO:
1-684, or 1369-1966, or those segments identified in Tables 3A, 3B, 5, 7, or 8. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a twenty-mer
is fully matched in the human genome is 1 in 300. In the human genome, there are three
billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there
are 300 times more twenty-mers than there are base pairs in a set of human chromosomes.
Using the same analysis, the probability for a seventeen-mer to be fully matched in the
human genome is approximately 1 in 5. When these segments are used in arrays for
expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because
expressed sequences comprise less than approximately 5% of the entire genome sequence.
Similarly, when using sequence information for detecting a single mismatch, a segment
can be a twenty-five mer. The probability that the twenty-five mer would appear in a human
genome with a single mismatch is calculated by multiplying the probability for a full match
(1 ÷ 4^25) times the increased probability for mismatch at each nucleotide position (3 x 25). The
probability that an eighteen mer with a single mismatch can be detected in an array for
expression studies is approximately one in five. The probability that a twenty-mer with a single
mismatch can be detected in a human genome is approximately one in five.
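
The arithmetic behind these estimates can be reproduced with a short calculation. The sketch below is a simplified model, not a statement of how the figures in the text were derived: it assumes a uniformly random sequence in which every base is equally likely, and it takes the three billion base pair genome size and the 5% expressed fraction from the text. The function and variable names are hypothetical.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on expressed sequence (from the text)


def expected_full_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of chance exact matches to one specific k-mer."""
    return target_bp / 4 ** k


def expected_single_mismatch_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of chance matches differing at exactly one position.

    Each of the k positions can hold any of 3 other bases, giving 3 * k
    single-mismatch variants, each as likely as the full match.
    """
    return 3 * k * target_bp / 4 ** k


expressed_bp = GENOME_BP * EXPRESSED_FRACTION

odds = {
    "20-mer, genome, full match": 1 / expected_full_matches(20),
    "17-mer, genome, full match": 1 / expected_full_matches(17),
    "15-mer, expressed, full match": 1 / expected_full_matches(15, expressed_bp),
    "18-mer, expressed, one mismatch": 1 / expected_single_mismatch_matches(18, expressed_bp),
    "20-mer, genome, one mismatch": 1 / expected_single_mismatch_matches(20),
}

for label, one_in in odds.items():
    print(f"{label}: about 1 in {one_in:.1f}")
```

Under these assumptions the results fall within the same order of magnitude as the rounded figures quoted in the text (1 in 300 and approximately one in five).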

The term "open reading frame," ORF, means a series of nucleotide triplets coding for 
amino acids without any termination codons and is a sequence translatable into protein. 

The terms "operably linked" or "operably associated" refer to functionally related
nucleic acid sequences. For example, a promoter is operably associated or operably linked
with a coding sequence if the promoter controls the transcription of the coding sequence.
While operably linked nucleic acid sequences can be contiguous and in the same reading
frame, certain genetic elements, e.g., repressor genes, are not contiguously linked to the coding
sequence but still control transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number
of differentiated cell types that are present in an adult organism. A pluripotent cell is
restricted in its differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to naturally
occurring or synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a
stretch of amino acid residues of at least about 5 amino acids, preferably at least about 7
amino acids, more preferably at least about 9 amino acids and most preferably at least about
17 or more amino acids. The peptide preferably is not greater than about 200 amino acids,
more preferably less than 150 amino acids and most preferably less than 100 amino acids.
Preferably the peptide is from about 5 to about 200 amino acids. To be active, any
polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells 
that have not been genetically engineered and specifically contemplates various polypeptides 
arising from post-translational modifications of the polypeptide including, but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the
full-length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein portion" means
that portion of the protein which does not include a signal or leader sequence. The peptide
may have been produced by processing in the cell which removes any leader/signal
sequence. The mature protein portion may or may not include the initial methionine residue.
The methionine residue may be removed from the protein during processing in the cell. The
peptide may be produced synthetically or the protein may have been produced using a
polynucleotide encoding only the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques 
as ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer 
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally
occur in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using,
e.g., recombinant DNA techniques. Guidance in determining which amino acid residues
may be replaced, added or deleted without abolishing activities of interest, may be found by
comparing the sequence of the particular polypeptide with that of homologous peptides and
minimizing the number of amino acid sequence changes made in regions of high homology
(conserved regions) or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may
be synthesized or selected by making use of the "redundancy" in the genetic code. Various
codon substitutions, such as the silent changes which produce various restriction sites, may
be introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be
reflected in the polypeptide or domains of other peptides added to the polypeptide to modify
the properties of any part of the polypeptide, to change characteristics such as ligand-binding
affinities, interchain affinities, or degradation/turnover rate.
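
The "redundancy" of the genetic code that makes silent codon changes possible can be illustrated as follows. This is a minimal sketch that assumes the standard genetic code and uses an asterisk for the stop codon, as in the Sequence Listing; the function names and the example sequences are hypothetical.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed for codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal) they encode.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_degenerate_variants(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the given peptide."""
    total = 1
    for aa in peptide.upper():
        total *= len(SYNONYMS[aa])
    return total


# A silent change (CTT -> CTG) leaves the encoded peptide unchanged.
original = translate("ATGCTTTAA")
silent_variant = translate("ATGCTGTAA")
print(original, silent_variant, count_degenerate_variants("ML"))
```

Because leucine has six codons and methionine one, six different coding sequences yield the same two-residue peptide in this example.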

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
5 another amino acid having similar structural and/or chemical properties, i.e., conservative 
amino acid replacements. "Conservative" amino acid substitutions may be made on the 
basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the 
amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino 
acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and 
10 methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, 
asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, 
and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic 
acid. "Insertions" or "deletions" are preferably in the range of about 1 to 20 amino acids, 
more preferably 1 to 10 amino acids. The variation allowed may be experimentally 
determined by systematically making insertions, deletions, or substitutions of amino acids in
a polypeptide molecule using recombinant DNA techniques and assaying the resulting 
recombinant variants for activity. 
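
As an illustration only, the residue groupings listed above can be written as a small lookup that reports whether a proposed substitution stays within one group. This is a minimal sketch: the group names and the function names residue_class and is_conservative are hypothetical labels chosen here, and membership in the same group is used as a simple stand-in for "similar structural and/or chemical properties."

```python
# Residue groups exactly as listed in the paragraph above (one-letter codes).
# The dictionary keys are illustrative labels, not terms defined by the text.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(aa):
    """Return the name of the group containing a one-letter amino acid code."""
    aa = aa.upper()
    for name, members in RESIDUE_CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")


def is_conservative(original, replacement):
    """True when both residues fall in the same group listed above."""
    return residue_class(original) == residue_class(replacement)
```

Under this sketch, is_conservative("L", "I") is true (both nonpolar) and is_conservative("L", "D") is false (nonpolar to acidic). Whether a given variant retains activity would still be determined experimentally, as the paragraph above states.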

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such 
alterations can, for example, alter one or more of the biological functions or biochemical
characteristics of the polypeptides of the invention. For example, such alterations may 
change polypeptide characteristics such as ligand-binding affinities, interchain affinities, or 
degradation/turnover rate. Further, such alterations can be selected so as to generate 
polypeptides that are better suited for expression, scale up and the like in the host cells 
chosen for expression. For example, cysteine residues can be deleted or substituted with
another amino acid residue in order to eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the
indicated nucleic acid or polypeptide is present in the substantial absence of other biological 
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the 
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight,
more preferably at least 99% by weight, of the indicated biological macromolecules present
(but water, buffers, and other small molecules, especially molecules having a molecular 
weight of less than 1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated
from at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic 
acid or polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide 
is found in the presence of (if anything) only a solvent, buffer, ion, or other component 
normally present in a solution of the same. The terms "isolated" and "purified" do not
encompass nucleic acids or polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or 
mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins
made in bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant
microbial" defines a polypeptide or protein essentially free of native endogenous substances 
and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed 
in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications;
polypeptides or proteins expressed in yeast will have a glycosylation pattern in general
different from those expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or 
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression 
vehicle can comprise a transcriptional unit comprising an assembly of (1) a genetic element 
or elements having a regulatory role in gene expression, for example, promoters or
enhancers, (2) a structural or coding sequence which is transcribed into mRNA and
translated into protein, and (3) appropriate transcription initiation and termination sequences.
Structural units intended for use in yeast or eukaryotic expression systems preferably include 
a leader sequence enabling extracellular secretion of translated protein by a host cell. 
Alternatively, where recombinant protein is expressed without a leader or transport
sequence, it may include an amino terminal methionine residue. This residue may or may
not be subsequently cleaved from the expressed recombinant protein to provide a final 
product. 

The term "recombinant expression system" means host cells which have stably 
integrated a recombinant transcriptional unit into chromosomal DNA or carry the
recombinant transcriptional unit extrachromosomally. Recombinant expression systems as
defined herein will express heterologous polypeptides or proteins upon induction of the 
regulatory elements linked to the DNA segment or synthetic gene to be expressed. This term 
also means host cells which have stably integrated a recombinant genetic element or
elements having a regulatory role in gene expression, for example, promoters or enhancers.
Recombinant expression systems as defined herein will express polypeptides or proteins 
endogenous to the cell upon induction of the regulatory elements linked to the endogenous 
DNA segment or gene to be expressed. The cells can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a
membrane, including transport as a result of signal sequences in its amino acid sequence
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation 
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in
which they are expressed. "Secreted" proteins also include without limitation proteins that
are transported across the membrane of the endoplasmic reticulum. "Secreted" proteins are
also intended to include proteins containing non-typical signal sequences (e.g., Interleukin-1
Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors
released from damaged cells (e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al.
(1998) Annu. Rev. Immunol. 16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a
sequence may be naturally present on the polypeptides of the present invention or provided 
from heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in
the art as stringent. Stringent conditions can include highly stringent conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1
mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent
conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization
conditions are described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate
at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20- 
base oligonucleotides), and 60°C (for 23-base oligonucleotides). 

As used herein, "substantially equivalent" or "substantially similar" can refer both to
nucleotide and amino acid sequences, for example a mutant sequence, that varies from a
reference sequence by one or more substitutions, deletions, or additions, the net effect of 
which does not result in an adverse functional dissimilarity between the reference and 
subject sequences. Typically, such a substantially equivalent sequence varies from one of
those listed herein by no more than about 35% (i.e., the number of individual residue
substitutions, additions, and/or deletions in a substantially equivalent sequence, as compared 
to the corresponding reference sequence, divided by the total number of residues in the 
substantially equivalent sequence is about 0.35 or less). Such a sequence is said to have 
65% sequence identity to the listed sequence. In one embodiment, a substantially
equivalent, e.g., mutant, sequence of the invention varies from a listed sequence by no more
than 30% (70% sequence identity); in a variation of this embodiment, by no more than 25% 
(75% sequence identity); and in a further variation of this embodiment, by no more than 
20% (80% sequence identity) and in a further variation of this embodiment, by no more than
10% (90% sequence identity) and in a further variation of this embodiment, by no more than
5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid sequences 
according to the invention preferably have at least 80% sequence identity with a listed amino 
acid sequence, more preferably at least 85% sequence identity, more preferably at least 90% 
sequence identity, more preferably at least 95% sequence identity, more preferably at least
98% sequence identity, and most preferably at least 99% sequence identity. Substantially
equivalent nucleotide sequences of the invention can have lower percent sequence identities,
taking into account, for example, the redundancy or degeneracy of the genetic code. 
Preferably, the nucleotide sequence has at least about 65% identity, more preferably at least 
about 75% identity, more preferably at least about 80% sequence identity, more preferably at
least 85% sequence identity, more preferably at least 90% sequence identity, more preferably
at least about 95% sequence identity, more preferably at least 98% sequence identity, and 
most preferably at least 99% sequence identity. For the purposes of the present invention, 
sequences having substantially equivalent biological activity and substantially equivalent 
expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which
creates a new stop codon) should be disregarded. Sequence identity may be determined, 
e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). 
Identity between sequences can also be determined by other methods known in the art, e.g. 
by varying hybridization conditions. 
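
The ratio described above (residue substitutions, additions, and/or deletions divided by the number of residues in the substantially equivalent sequence) can be sketched as follows. This is an illustrative calculation only, not the Jotun Hein method: it assumes the two sequences have already been aligned by some other tool, that gaps are written as "-", and the function name percent_identity is hypothetical.

```python
def percent_identity(reference_aln, subject_aln):
    """Percent identity per the ratio described above.

    Both arguments are rows of an existing pairwise alignment of equal length,
    with '-' marking a gap. Every column where the rows differ counts as one
    substitution, addition, or deletion; the denominator is the number of
    residues in the subject (substantially equivalent) sequence.
    """
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned rows must have the same length")
    differences = sum(1 for r, s in zip(reference_aln, subject_aln) if r != s)
    subject_len = sum(1 for s in subject_aln if s != "-")
    if subject_len == 0:
        raise ValueError("subject sequence contains no residues")
    return 100.0 * (subject_len - differences) / subject_len


# One substitution in ten residues: a ratio of 0.10, i.e. 90% identity.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

A sequence differing at 35 of 100 residues would give a ratio of 0.35 and therefore 65% identity, matching the threshold stated above.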

The term "totipotent" refers to the capability of a cell to differentiate into all of the
cell types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that
the DNA is replicable, either as an extrachromosomal element, or by chromosomal
integration. The term "transfection" refers to the taking up of an expression vector by a
suitable host cell, whether or not any coding sequences are in fact expressed. The term 
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a 
virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can be
readily identified using known UMFs as a target sequence or target motif with the 
computer-based systems described below. The presence and activity of a UMF can be 
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic acid 
molecule is then incubated with an appropriate host under appropriate conditions and the
uptake of the marker sequence is determined. As described above, a UMF will increase the 
frequency of uptake of a linked marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless 
the context dictates otherwise. 


4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 
The isolated polynucleotides of the invention include a polynucleotide comprising 
the nucleotide sequences of SEQ ID NO: 1-684, or 1369-1966; a polynucleotide encoding
any one of the peptide sequences of SEQ ID NO: 1-684, or 1369-1966; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the 
polynucleotides of any one of SEQ ID NO: 1-684, or 1369-1966. The polynucleotides of the 
present invention also include, but are not limited to, a polynucleotide that hybridizes under 
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID
NO: 1-684, or 1369-1966; (b) nucleotide sequences encoding any one of the amino acid
sequences set forth in the Sequence Listing, or Table 7; (c) a polynucleotide which is an 
allelic variant of any polynucleotide recited above; (d) a polynucleotide which encodes a 
species homologue of any of the proteins recited above; or (e) a polynucleotide that encodes 
a polypeptide comprising a specific domain or truncation of the polypeptides of SEQ ID NO:
685-1368, or 1967-2564 (for example, as set forth in Tables 3A, 3B, 5, 7, or 8). Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like 
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, 
or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or 
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include the entire coding region of the cDNA or may represent a portion of
the coding region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences 
disclosed herein. The corresponding genes can be isolated in accordance with known methods
using the sequence information disclosed herein. Such methods include the preparation of
probes or primers from the disclosed sequence information for identification and/or 
amplification of genes in appropriate genomic libraries or other sources of genomic materials. 
Further 5' and 3' sequence can be obtained using methods known in the art. For example, full
length cDNA or genomic DNA that corresponds to any of the polynucleotides of SEQ ID NO:
1-684, or 1369-1966 can be obtained by screening appropriate cDNA or genomic DNA
libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID
NO: 1-684, or 1369-1966 or a portion thereof as a probe. Alternatively, the polynucleotides of 
SEQ ID NO: 1-684, or 1369-1966 may be used as the basis for suitable primer(s) that allow 
identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence 
information, representative fragment or segment information, or novel segment information for 
the full-length gene. 

The polynucleotides of the invention also provide polynucleotides including
nucleotide sequences that are substantially equivalent to the polynucleotides recited above.
Polynucleotides according to the invention can have, e.g., at least about 65%, at least about 
70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least 
about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%,
and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a
polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic 
acid sequence fragments that hybridize under stringent conditions to any of the nucleotide
sequences of SEQ ID NO: 1-684, or 1369-1966, or complements thereof, which fragment is
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g. 15, 17, or 20 
nucleotides or more that are selective for (i.e. specifically hybridize to) any one of the 
polynucleotides of the invention are contemplated. Probes capable of specifically
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the invention
from other polynucleotide sequences in the same family of genes or can differentiate human 
genes from genes of other species, and are preferably based on unique nucleotide sequences. 
The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1- 
684, or 1369-1966, a representative fragment thereof, or a nucleotide sequence at least 90% 
identical, preferably 95% identical, to SEQ ID NO: 1-684, or 1369-1966 with a sequence from 
another isolate of the same species. Furthermore, to accommodate codon variability, the
invention includes nucleic acid molecules coding for the same amino acid sequences as do the
specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of 
one codon for another codon that encodes the same amino acid is expressly contemplated. 
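
To make the codon-substitution point concrete, the following minimal sketch checks whether two ORFs of equal length encode the same amino acid sequence. It assumes the standard genetic code and DNA written in the letters A, C, G, and T; the names CODON_TABLE, translate, and encodes_same_protein are hypothetical and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an ORF (length divisible by 3) into one-letter amino acids."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encodes_same_protein(orf_a, orf_b):
    """True when the two ORFs differ only by synonymous codon substitutions."""
    return translate(orf_a) == translate(orf_b)


# GCT and GCC both encode alanine, so this substitution is silent.
print(encodes_same_protein("ATGGCTTAA", "ATGGCCTAA"))  # True
```

By contrast, changing GCT to GAT replaces alanine with aspartic acid, so encodes_same_protein("ATGGCTTAA", "ATGGATTAA") is false under this sketch.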

The nearest neighbor or homology results for the nucleic acids of the present invention, 
including SEQ ID NO: 1-684, or 1369-1966 can be obtained by searching a database using an
algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search Tool) program is
used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively a FASTA version 3 search
against Genpept, using FASTXY algorithm may be performed. 

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are
also provided by the present invention. Species homologs may be isolated and identified by
making suitable probes or primers from the sequences provided herein and screening a 
suitable nucleic acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which
also encode proteins which are identical, homologous or related to that encoded by the
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which 
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic 
acids encoding the amino acid sequence variants are preferably constructed by mutating the 
polynucleotide to encode an amino acid sequence that does not occur in nature. These
nucleic acid alterations can be made at sites that differ in the nucleic acids from different 
species (variable positions) or in highly conserved regions (constant regions). Sites at such 
locations will typically be modified in series, e.g., by substituting first with conservative 
choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid) and then with
more distant choices (e.g., hydrophobic amino acid to a charged amino acid), and then
deletions or insertions may be made at the target site. Amino acid sequence deletions 
generally range from about 1 to 30 residues, preferably about 1 to 10 residues, and are 
typically contiguous. Amino acid insertions include amino- and/or carboxyl-terminal 
fusions ranging in length from one to one hundred or more residues, as well as intrasequence 
insertions of single or multiple amino acid residues. Intrasequence insertions may range
generally from about 1 to 10 amino residues, preferably from 1 to 5 residues. Examples of 
terminal insertions include the heterologous signal sequences necessary for secretion or for 
intracellular targeting in different host cells and sequences such as FLAG or poly-histidine 
sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter
a polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent 
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of 
the site being changed. In general, the techniques of site-directed mutagenesis are well
known to those of skill in the art and this technique is exemplified by publications such as,
Edelman et al., DNA 2:183 (1983). A versatile and efficient method for producing 
site-specific changes in a polynucleotide sequence was published by Zoller and Smith, 
Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino acid 
sequence variants of the novel nucleic acids. When small amounts of template DNA are 
used as starting material, primer(s) that differ slightly in sequence from the corresponding
region in the template DNA can generate the desired amino acid variant. PCR amplification 
results in a population of product DNA fragments that differ from the polynucleotide 
template encoding the polypeptide at the position specified by the primer. The product DNA
fragments replace the corresponding region in the plasmid and this gives a polynucleotide
encoding the desired amino acid variant. 
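
The oligonucleotide layout described above (the altered codon with adjacent template nucleotides on both sides) can be sketched as a simple string operation. This is a hypothetical illustration: the function name mutagenic_primer, the default flank length of 12 nucleotides, and the example template are all assumptions made here, and real primer design would also weigh melting temperature and secondary structure.

```python
def mutagenic_primer(template, codon_index, new_codon, flank=12):
    """Build a sense-strand oligonucleotide carrying one codon change.

    template    -- coding-strand DNA, read in frame from its first base
    codon_index -- zero-based index of the codon to replace
    new_codon   -- three-base codon encoding the desired amino acid
    flank       -- unchanged template bases kept on each side (an assumption;
                   the text only requires enough to form a stable duplex)
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    start = 3 * codon_index
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template on one side of the codon")
    left = template[start - flank:start]
    right = template[end:end + flank]
    return left + new_codon.upper() + right


# Replace the fourth codon (GGT, glycine) with GAT (aspartic acid),
# keeping six template bases on each side.
print(mutagenic_primer("ATGGCTAAAGGTCTGGAACGT", 3, "GAT", flank=6))
# GCTAAAGATCTGGAA
```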

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques
well known in the art, such as, for example, the techniques in Sambrook et al., supra, and
Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of 
the genetic code, other DNA sequences which encode substantially the same or a 
functionally equivalent amino acid sequence may be used in the practice of the invention for 
the cloning and expression of these novel nucleic acids. Such DNA sequences include those 
which are capable of hybridizing to the appropriate novel nucleic acid sequence under
stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention could be 
used to generate polynucleotides encoding chimeric or fusion proteins comprising one or 
more domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of
the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA,
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such 
polynucleotides are well known to those of skill in the art and can include, for example, 
methods for determining hybridization conditions that can routinely isolate polynucleotides 
of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-684, or 1369-1966, or 
functional equivalents thereof, may be used to generate recombinant DNA molecules that 
direct the expression of that nucleic acid, or a functional equivalent thereof, in appropriate
host cells. Also included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et 
al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). 
Useful nucleotide sequences for joining to polynucleotides include an assortment of vectors, 
e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well
known in the art. Accordingly, the invention also provides a vector including a 
polynucleotide of the invention and a host cell containing the polynucleotide. In general, the 
vector contains an origin of replication functional in at least one organism, convenient
restriction endonuclease sites, and a selectable marker for the host cell. Vectors according to
the invention include expression vectors, replication vectors, probe generation vectors, and 
sequencing vectors. A host cell according to the invention can be a prokaryotic or 
eukaryotic cell and can be a unicellular organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic
acid having any of the nucleotide sequences of SEQ ID NO: 1-684, or 1369-1966 or a
fragment thereof or any other polynucleotides of the invention. In one embodiment, the 
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral 
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-
684, or 1369-1966 or a fragment thereof is inserted, in a forward or reverse orientation. In
the case of a vector comprising one of the ORFs of the present invention, the vector may
further comprise regulatory sequences, including for example, a promoter, operably linked to 
the ORF. Large numbers of suitable vectors and promoters are known to those of skill in the 
art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example: Bacterial: pBs,
phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a
(Stratagene), pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); Eukaryotic:
pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene), pSVK3, pBPV, pMSG, pSVL
(Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly.
Many suitable expression control sequences are known in the art. General methods of 
expressing recombinant proteins are also known and are exemplified in R. Kaufman,
Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked" means
that the isolated polynucleotide of the invention and an expression control sequence are 
situated within a vector or cell in such a way that the protein is expressed by a host cell 
which has been transformed (transfected) with the ligated polynucleotide/expression control 
sequence. 

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include
lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate
early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse
metallothionein-I. Selection of the appropriate vector and promoter is well within the level 
of ordinary skill in the art. Generally, recombinant expression vectors will include origins of 
replication and selectable markers permitting transformation of the host cell, e.g., the 
ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived
from a highly expressed gene to direct transcription of a downstream structural sequence. 
Such promoters can be derived from operons encoding glycolytic enzymes such as
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among
others. The heterologous structural sequence is assembled in appropriate phase with
translation initiation and termination sequences, and preferably, a leader sequence capable of
directing secretion of translated protein into the periplasmic space or extracellular medium. 
Optionally, the heterologous sequence can encode a fusion protein including an amino 
terminal identification peptide imparting desired characteristics, e.g., stabilization or 
simplified purification of expressed recombinant product. Useful expression vectors for
bacterial use are constructed by inserting a structural DNA sequence encoding a desired
protein together with suitable translation initiation and termination signals in operable 
reading phase with a functional promoter. The vector will comprise one or more phenotypic 
selectable markers and an origin of replication to ensure maintenance of the vector and to, if 
desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may
also be employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial 
use can comprise a selectable marker and bacterial origin of replication derived from
commercially available plasmids comprising genetic elements of the well known cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, 
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and 
the structural sequence to be expressed. Following transformation of a suitable host strain
and growth of the host strain to an appropriate cell density, the selected promoter is induced
or derepressed by appropriate means (e.g., temperature shift or chemical induction) and cells 
are cultured for an additional period. Cells are typically harvested by centrifugation, 



WO 2004/080148 PCT/US2003/030720 


disrupted by physical or chemical means, and the resulting crude extract retained for further 
purification. 

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech 17, 870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intra-muscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form 
of naked DNA. 


4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules 
that are hybridizable to or complementary to the nucleic acid molecule comprising the 
nucleotide sequence of SEQ ID NO: 1-684, or 1369-1966, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA 
sequence. In specific aspects, antisense nucleic acid molecules are provided that comprise a 
sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an
entire coding strand, or to only a portion thereof. Nucleic acid molecules encoding
fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO: 1-684, or
1369-1966 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID 
NO: 1-684, or 1369-1966 are additionally provided. 

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence of the invention. The term "coding
region" refers to the region of the nucleotide sequence comprising codons which are 
translated into amino acid residues. In another embodiment, the antisense nucleic acid 
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence 
of the invention. The term "noncoding region" refers to 5* and 3 1 sequences that flank the 

30 coding region that are not translated into amino acids (*.& , also referred to as 5' and 3 1 
untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein
(e.g., SEQ ID NO: 1-684, or 1369-1966), antisense nucleic acids of the invention can be designed
according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic
acid molecule can be complementary to the entire coding region of an mRNA, but more 
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding 
region of an mRNA. For example, the antisense oligonucleotide can be complementary to 
the region surrounding the translation start site of an mRNA. An antisense oligonucleotide
can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An 
antisense nucleic acid of the invention can be constructed using chemical synthesis or 
enzymatic ligation reactions using procedures known in the art. For example, an antisense 
nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using
naturally occurring nucleotides or variously modified nucleotides designed to increase the
biological stability of the molecules or to increase the physical stability of the duplex formed 
between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and 
acridine substituted nucleotides can be used. 
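
As an illustration of the Watson-Crick design rule described above, the following minimal Python sketch derives an antisense oligonucleotide complementary to the region surrounding a translation start site. The input fragment, the function names, and the choice to treat the first ATG as the start codon are assumptions made for this example only; modified nucleotides and Hoogsteen pairing are not modeled.

```python
# Watson-Crick complements for unmodified DNA bases (illustrative only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_around_start(coding_strand, length=20):
    """Return an antisense oligo covering a window around the first ATG.

    Assumption: the first ATG in the coding strand is the translation
    start site. Real start-site annotation should be used when available.
    """
    start = coding_strand.upper().find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    center = start + 1
    left = max(0, center - length // 2)
    right = min(len(coding_strand), left + length)
    left = max(0, right - length)  # shift the window back if it hit the 3' end
    return reverse_complement(coding_strand[left:right])


# Hypothetical coding-strand fragment, invented for this example.
example = "GCCGCCACCATGGCTAGCAAGGGCGAGGAG"
oligo = antisense_around_start(example, length=20)
```

The `length` parameter corresponds to the oligonucleotide lengths mentioned above (e.g., about 15 to 50 nucleotides); the window is clipped when the sequence is shorter than the requested length.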

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine,
5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester,
uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil,
(acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced
biologically using an expression vector into which a nucleic acid has been subcloned in an 
antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an 
antisense orientation to a target nucleic acid of interest, described further in the following 
subsection).

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of
the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the 
case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific 
interactions in the major groove of the double helix. An example of a route of 
administration of antisense nucleic acid molecules of the invention includes direct injection
at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target 
selected cells and then administered systemically. For example, for systemic administration, 
antisense molecules can be modified such that they specifically bind to receptors or antigens 
expressed on a selected cell surface, by linking the antisense nucleic acid molecules to 
peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic
acid molecules can also be delivered to cells using the vectors described herein. To achieve 
sufficient intracellular concentrations of antisense molecules, vector constructs in which the 
antisense nucleic acid molecule is placed under the control of a strong pol II or pol III
promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units,
the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having specificity
for a nucleic acid of the invention can be designed based upon the nucleotide sequence of a
DNA disclosed herein (i.e., SEQ ID NO: 1-684, or 1369-1966). For example, a derivative of
Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the
active site is complementary to the nucleotide sequence to be cleaved in a mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
mRNA of the invention can be used to select a catalytic RNA having a specific ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science
261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple
helical structures that prevent transcription of the gene in target cells. See generally, Helene
(1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci.
660:27-36; and Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup
et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide nucleic acids"
or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four natural
nucleobases are retained. The neutral backbone of PNAs has been shown to allow for
specific hybridization to DNA and RNA under conditions of low ionic strength. The
synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis
protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996) PNAS 93:
14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation 
of gene expression by, e.g., inducing transcription or translation arrest or inhibiting
replication. PNAs of the invention can also be used, e.g., in the analysis of single base pair
mutations in a gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes
when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above);
or as probes or primers for DNA sequence and hybridization (Hyrup et al. (1996), above;
Perry-O'Keefe (1996), above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of
base stacking, number of bonds between the nucleobases, and orientation (Hyrup (1996)
above). The synthesis of PNA-DNA chimeras can be performed as described in Hyrup
(1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a DNA chain
can be synthesized on a solid support using standard phosphoramidite coupling chemistry,
and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine
phosphoramidite, can be used between the PNA and the 5' end of DNA (Mag et al. (1989)
Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner to
produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn et al.
(1996) above). Alternatively, chimeric molecules can be synthesized with a 5' DNA
segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem Lett 5:
1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such 
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport 
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication
No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134).
In addition, oligonucleotides can be modified with hybridization triggered cleavage agents
(see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g.,
Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport
agent, a hybridization-triggered cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain 
the polynucleotides of the invention. For example, such host cells may contain nucleic acids 
of the invention introduced into the host cell using known transformation, transfection or
infection methods. The present invention still further provides host cells genetically
engineered to express the polynucleotides of the invention, wherein such polynucleotides are
in operative association with a regulatory sequence heterologous to the host cell which
drives expression of the polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the naturally occurring promoter with all or part of a heterologous promoter 
so that the cells express the polypeptide at higher levels. The heterologous promoter is 
inserted in such a manner that it is operatively linked to the encoding sequences. See, for 
example, PCT International Publication No. WO94/12650, PCT International Publication
No. WO92/20808, and PCT International Publication No. WO91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA
(e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate 
synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be 
inserted along with the heterologous promoter DNA. If linked to the coding sequence,
amplification of the marker DNA by standard selection methods results in co-amplification
of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation
(Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one 
of the polynucleotides of the invention, can be used in conventional manners to produce the 
gene product encoded by the isolated fragment (in the case of an ORF) or can be used to 
produce a heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells,
CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and
B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under
the control of appropriate promoters. Cell-free translation systems can also be employed to
produce such proteins using RNAs derived from the DNA constructs of the present
invention. Appropriate cloning and expression vectors for use with prokaryotic and
eukaryotic hosts are described by Sambrook et al., in Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, New York (1989), the disclosure of which is 
hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines
of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines 
capable of expressing a compatible vector are, for example, the C127, monkey COS cells, 
Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, 
human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal
diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants,
HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian expression 
vectors will comprise an origin of replication, a suitable promoter and also any necessary 
ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional 
termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived
from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice,
and polyadenylation sites may be used to provide the required nontranscribed genetic
elements. Recombinant polypeptides and proteins produced in bacterial culture are usually 
isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous 
ion exchange or size exclusion chromatography steps. Protein refolding steps can be used,
as necessary, in completing configuration of the mature protein. Finally, high performance
liquid chromatography (HPLC) can be employed for final purification steps. Microbial cells 
employed in expression of proteins can be disrupted by any convenient method, including 
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as
yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, 
or any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or
bacteria, it may be necessary to modify the protein produced therein, for example by
phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional
protein. Such covalent attachments may be accomplished using known chemical or 
enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the 
control of inducible regulatory elements, in which case the regulatory sequences of the 
endogenous gene may be replaced by homologous recombination. As described herein, gene 
targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by 
genetic engineering methods. Such regulatory sequences may be comprised of promoters, 
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional 
initiation sites, and regulatory protein binding sites or combinations of said sequences. 

Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or
other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple 
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different 
cell-type specificity than the naturally occurring elements. Here, the naturally occurring 
sequences are deleted and new sequences are added. In all cases, the identification of the 
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the
exogenous DNA has integrated into the host cell genome. The identification of the targeting
event may also be facilitated by the use of one or more marker genes exhibiting the property 
of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker. 
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK) 
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 
to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO: 685-1368,
or 1967-2564 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-684, or 1369-1966 or the corresponding full length or mature
protein. Polypeptides of the invention also include polypeptides preferably with biological or
immunological activity that are encoded by: (a) a polynucleotide having any one of the
nucleotide sequences set forth in SEQ ID NO: 1-684, or 1369-1966 or (b) polynucleotides
encoding any one of the amino acid sequences set forth as SEQ ID NO: 685-1368, or 1967-2564
or (c) polynucleotides that hybridize to the complement of the polynucleotides of either
(a) or (b) under stringent hybridization conditions. The invention also provides biologically
active or immunologically active variants of any of the amino acid sequences set forth as
SEQ ID NO: 685-1368, or 1967-2564 or the corresponding full length or mature protein; and
"substantial equivalents" thereof (e.g., with at least about 65%, at least about 70%, at least
about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%, at least about
90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more typically at least
about 98%, or most typically at least about 99% amino acid identity) that retain biological
activity. Polypeptides encoded by allelic variants may have a similar, increased, or
decreased activity compared to polypeptides comprising SEQ ID NO: 685-1368, or 1967-
2564. 
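
The percent amino acid identity thresholds listed above can be made concrete with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it divides matches by the number of aligned columns; both the function name and this choice of denominator are assumptions for illustration, since alignment programs use differing conventions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences.

    Columns that are gaps in both sequences are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical ten-residue peptides differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")  # 9 of 10 match
```

Under these assumptions, a variant would meet the "at least about 90%" threshold when `percent_identity` returns 90.0 or more.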

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein 
may be in linear form or they may be cyclized using known methods, for example, as
described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S.
McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such as
immunoglobulins for many purposes, including increasing the valency of protein binding
sites. Fragments are also identified in Tables 3A, 3B, 5, 7, or 8. 

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein
coding sequence is identified in the sequence listing by translation of the disclosed
nucleotide sequences. The predicted signal sequence is set forth in Table 5. The mature
form of such protein may be obtained and confirmed by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell and sequencing of the cleaved
product. One of skill in the art will recognize that the actual cleavage site may be different
than that predicted in Table 5. The sequence of the mature form of the protein is also
determinable from the amino acid sequence of the full-length form. Where proteins of the
present invention are membrane bound, soluble forms of the proteins are also provided. In
such forms, part or all of the regions causing the proteins to be membrane bound are deleted
so that the proteins are fully secreted from the cell in which they are expressed (see, e.g.,
Sakal et al., Prep. Biochem. Biotechnol. (2000), 30(2), pp. 107-23, incorporated herein by
reference).

Protein compositions of the present invention may further comprise an acceptable 
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic
acid fragments of the present invention or by degenerate variants of the nucleic acid
fragments of the present invention. By "degenerate variant" is intended nucleotide
fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF)
by nucleotide sequence but, due to the degeneracy of the genetic code, encode an identical
polypeptide sequence. Preferred nucleic acid fragments of the present invention are the
ORFs that encode proteins.
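
The notion of a degenerate variant defined above can be checked computationally. The following minimal Python sketch builds the standard genetic code table, translates two ORFs, and reports whether they differ in nucleotide sequence while encoding an identical polypeptide. The function names and the two short ORFs are hypothetical, introduced only for this illustration, and the sketch assumes the standard genetic code and unambiguous A/C/G/T input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(dna_a, dna_b):
    """True if the sequences differ but encode the same polypeptide."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Hypothetical ORF fragments, both encoding Met-Ala-Ser-Lys.
orf_a = "ATGGCTAGCAAG"
orf_b = "ATGGCCTCTAAA"
same_protein = is_degenerate_variant(orf_a, orf_b)
```

Here `orf_b` replaces three codons of `orf_a` with synonymous codons, so the encoded polypeptide is unchanged.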

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino 
acid sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or
tertiary structural and/or conformational characteristics with proteins may possess biological
properties in common therewith, including protein activity. This technique is particularly 
useful in producing small peptides and fragments of larger polypeptides. Fragments are 
useful, for example, in generating antibodies against the native polypeptide. Thus, they may
be employed as biologically active or immunological substitutes for natural, purified
proteins in screening of therapeutic compounds and in immunological processes for the 
development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified 
from cells which have been altered to express the desired polypeptide or protein. As used
herein, a cell is said to be altered to express a desired polypeptide or protein when the cell, 
through genetic manipulation, is made to produce a polypeptide or protein which it normally 
does not produce or which the cell normally produces at a lower level. One skilled in the art 
can readily adapt procedures for introducing and expressing either recombinant or synthetic
sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one
of the polypeptides or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising 
growing a culture of host cells of the invention in a suitable culture medium, and purifying 
the protein from the cells or the culture in which the cells are grown. For example, the
methods of the invention include a process for producing a polypeptide in which a host cell
containing a suitable expression vector that includes a polynucleotide of the invention is 
cultured under conditions that allow expression of the encoded polypeptide. The 
polypeptide can be recovered from the culture, conveniently from the culture medium, or 
from a lysate prepared from the host cells and further purified. Preferred embodiments
include those in which the protein produced by such process is a full length or mature form
of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells 
which naturally produce the polypeptide or protein. One skilled in the art can readily follow 
known methods for isolating polypeptides and proteins in order to obtain one of the isolated 
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange 
chromatography, and immuno-affinity chromatography. See, e.g., Scopes, Protein
Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al., in
Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in Molecular
Biology. Polypeptide fragments that retain biological/immunological activity include
fragments comprising greater than about 100 amino acids, or greater than about 200 amino
acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well
known in the art to identify molecules which bind to the polypeptides. These molecules 
include, but are not limited to, e.g., small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal models that are
well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or 
animals and then tested for either cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the 
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds that 
are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or other
cell by the specificity of the binding molecule for SEQ ID NO: 685-1368, or 1967-2564. 

The protein of the invention may also be expressed as a product of transgenic 
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are
characterized by somatic or germ cells containing a nucleotide sequence encoding the
protein.

The proteins provided herein also include proteins characterized by amino acid 
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications, in the peptide or DNA
sequence, can be made by those skilled in the art using known techniques. Modifications of
interest in the protein sequences may include the alteration, substitution, replacement,
insertion or deletion of a selected amino acid residue in the coding sequence. For example,
one or more of the cysteine residues may be deleted or replaced with another amino acid to
alter the conformation of the molecule. Techniques for such alteration, substitution,
replacement, insertion or deletion are well known to those skilled in the art (see, e.g., U.S.
Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement, insertion or
deletion retains the desired activity of the protein. Regions of the protein that are important
for the protein function can be determined by various methods known in the art including the
alanine-scanning method which involves systematic substitution of single or strings of
amino acids with alanine, followed by testing the resulting alanine-containing variant for
biological activity. This type of analysis determines the importance of the substituted amino
acid(s) in biological activity. Regions of the protein that are important for protein function
may be determined by the eMATRIX program.

Other fragments and derivatives of the sequences of proteins which would be
expected to retain protein activity in whole or in part and are useful for screening or other 
immunological methodologies may also be easily made by those skilled in the art given the 
disclosures herein. Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of
the invention to suitable control sequences in one or more insect expression vectors, and
employing an insect expression system. Materials and methods for baculovirus/insect cell
expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego,
Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art, as described
in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987),
incorporated herein by reference. As used herein, an insect cell capable of expressing a
polynucleotide of the present invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells
under culture conditions suitable to express the recombinant protein. The resulting
expressed protein may then be purified from such culture (i.e., from culture medium or cell
extracts) using known purification processes, such as gel filtration and ion exchange
chromatography. The purification of the protein may also include an affinity column
containing agents which will bind to the protein; one or more column steps over such affinity
resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA Sepharose™;
one or more steps involving hydrophobic interaction chromatography using such resins as
phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with
a His tag. Kits for expression and purification of such fusion proteins are commercially
available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and
Invitrogen, respectively. The protein can also be tagged with an epitope and subsequently
purified by using a specific antibody directed to such epitope. One such epitope ("FLAG®")
is commercially available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other aliphatic groups, can be employed to further purify the protein. Some or all
of the foregoing purification steps, in various combinations, can also be employed to provide
a substantially homogeneous isolated recombinant protein. The protein thus purified is
substantially free of other mammalian proteins and is defined in accordance with the present
invention as an "isolated protein."

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids have been deleted, inserted,
or substituted. Also, analogs of the polypeptides of the invention embrace fusions of the
polypeptides or modifications of the polypeptides of the invention, wherein the polypeptide
or analog is fused to another moiety or moieties, e.g., a targeting moiety or another therapeutic
agent. Such analogs may exhibit improved properties such as activity and/or stability.

Examples of moieties which may be fused to the polypeptide or an analog include, for
example, targeting moieties which provide for the delivery of polypeptide to pancreatic cells,
e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-cells, monocytes,
dendritic cells, granulocytes, etc., as well as receptors and ligands expressed on pancreatic or
immune cells. Other moieties which may be fused to the polypeptide include therapeutic
agents which are used for treatment, for example, immunosuppressive drugs such as
cyclosporin, FK506, azathioprine, CD3 antibodies and steroids. Also, polypeptides may be
fused to immune modulators, and other cytokines such as alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified
in computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul,
S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic
Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu
et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif
software (Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by
reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322
(1998), herein incorporated by reference), the Kyte-Doolittle hydrophobicity prediction
algorithm (J. Mol. Biol., 157, pp. 105-31 (1982)), the GeneAtlas software (Molecular
Simulations Inc. (MSI), San Diego, CA) (Sanchez and Sali (1998) Proc. Natl. Acad. Sci., 95,
[...] modeling - an evaluation" Submitted; Fischer and Eisenberg (1996) Protein Sci. 5,
947-955), and the Neural Network SignalP V1.1 program (from Center for Biological Sequence
Analysis, The Technical University of Denmark), incorporated herein by reference.
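
By way of a non-limiting sketch, percent identity and percent similarity can be computed
from a pairwise alignment that one of the programs listed above has already produced. The
snippet below is not the method of any of those programs. It assumes two aligned strings of
equal length with '-' marking gaps, it counts every aligned column (including gapped ones)
in the denominator, and it uses a small hypothetical table of conservative residue groups
in place of a real scoring matrix.

```python
# Hypothetical conservative-substitution groups, chosen only for illustration;
# real programs derive similarity from scoring matrices such as BLOSUM or PAM.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (percent_identity, percent_similarity) for two aligned sequences.

    Both inputs must have the same length and use '-' for gaps. Columns in
    which both sequences carry a gap are ignored. Every other column counts
    toward the denominator, which is an assumed convention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == "-" or b == "-":
            continue  # a gap opposite a residue is neither identical nor similar
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    identity, similarity = percent_identity_similarity("MKVL-A", "MRILQA")
    print(f"identity={identity:.1f}% similarity={similarity:.1f}%")
```

Other denominator conventions (for example, the length of the shorter sequence) are also in
use, so reported percentages should state which convention was applied.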

Polypeptide sequences were examined by a proprietary algorithm, SeqLoc, that separates the
proteins into three sets of locales: intracellular, membrane, or secreted. This prediction is
based upon three characteristics of each polypeptide, including percentage of cysteine
residues, Kyte-Doolittle scores for the first 20 amino acids of each protein, and
Kyte-Doolittle scores to calculate the longest hydrophobic stretch of the said protein. Values
of predicted proteins are compared against the values from a set of 592 proteins of known
cellular localization from the Swissprot database (http://www.expasy.ch/sprot). Predictions
are based upon the maximum likelihood estimation.
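
SeqLoc itself is proprietary and its implementation is not disclosed here. Purely as a
hedged sketch, the three characteristics named above could be computed along the following
lines. The function name is hypothetical, the use of the mean score over the first 20
residues is an assumption, and the definition of a hydrophobic stretch (a run of
consecutive residues with a positive Kyte-Doolittle value) is likewise an assumption. The
maximum likelihood comparison against the reference set of proteins is not shown.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, J. Mol. Biol. 157:105-132, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def localization_features(sequence):
    """Return (percent_cysteine, n_terminal_mean_kd, longest_hydrophobic_run).

    Unknown residue codes are scored as 0.0, which is an illustrative choice.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    percent_cysteine = 100.0 * sequence.count("C") / len(sequence)

    # Mean hydropathy of the first 20 residues (fewer if the sequence is shorter).
    head = sequence[:20]
    n_terminal_mean = sum(KYTE_DOOLITTLE.get(r, 0.0) for r in head) / len(head)

    # Longest run of consecutive residues with positive hydropathy.
    longest = run = 0
    for residue in sequence:
        run = run + 1 if KYTE_DOOLITTLE.get(residue, 0.0) > 0 else 0
        longest = max(longest, run)
    return percent_cysteine, n_terminal_mean, longest


if __name__ == "__main__":
    # Hypothetical short peptide used only to show the output shape.
    print(localization_features("MLLLCKD"))
```

A classifier would then compare these three values against the corresponding distributions
from proteins of known localization.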

Presence of transmembrane region(s) was detected using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html).

The BLAST programs are publicly available from the National Center for
Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul, S., et al.
NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-410
(1990)).

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically
active portions of a protein according to the invention. Within the fusion protein, the term
"operatively linked" is intended to indicate that the polypeptide according to the invention
and the other polypeptide are fused in-frame to each other. The polypeptide can be fused to
the N-terminus or C-terminus, or to the middle.

For example, in one embodiment a fusion protein comprises a polypeptide according
to the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e.,
glutathione S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in
which the polypeptide sequences according to the invention comprise one or more domains
fused to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in
vivo. The immunoglobulin fusion proteins can be used to affect the bioavailability of a
cognate ligand. Inhibition of the ligand/protein interaction may be useful therapeutically for
both the treatment of proliferative and differentiative disorders, e.g., cancer, as well as
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin
fusion proteins of the invention can be used as immunogens to produce antibodies in a
subject, to purify ligands, and in screening assays to identify molecules that inhibit the
interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by conventional
techniques including automated DNA synthesizers. Alternatively, PCR amplification of
gene fragments can be carried out using anchor primers that give rise to complementary
overhangs between two consecutive gene fragments that can subsequently be annealed and
reamplified to generate a chimeric gene sequence (see, for example, Ausubel et al. (eds.)
Current Protocols in Molecular Biology, John Wiley & Sons, 1992). Moreover,
many expression vectors are commercially available that already encode a fusion moiety
(e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the invention can be
cloned into such an expression vector such that the fusion moiety is linked in-frame to the
protein of the invention.
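
The requirement that the fusion partners be joined in-frame can be checked in silico before
any cloning is attempted. The following is a minimal sketch under stated assumptions and
does not represent any particular vector or kit. The function names, the optional linker
and the example sequences are hypothetical, and the check covers only reading frame and
premature stop codons, not restriction sites or codon usage.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence into consecutive three-nucleotide codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fuse_in_frame(upstream_cds, downstream_cds, linker=""):
    """Join two coding sequences so that the downstream partner stays in frame.

    The terminal stop codon of the upstream partner (e.g., a fusion moiety such
    as GST) is removed, an optional linker is inserted, and the fused sequence
    is rejected if any segment length is not a multiple of three or if a stop
    codon occurs before the final codon.
    """
    parts = {
        "upstream": upstream_cds.upper(),
        "linker": linker.upper(),
        "downstream": downstream_cds.upper(),
    }
    for name, part in parts.items():
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length {len(part)} is not a multiple of 3")

    upstream_codons = split_codons(parts["upstream"])
    if upstream_codons and upstream_codons[-1] in STOP_CODONS:
        upstream_codons.pop()  # read through into the downstream partner

    fused = "".join(upstream_codons) + parts["linker"] + parts["downstream"]
    if any(codon in STOP_CODONS for codon in split_codons(fused)[:-1]):
        raise ValueError("premature stop codon in fused sequence")
    return fused


if __name__ == "__main__":
    # Hypothetical toy sequences, far shorter than any real coding sequence.
    print(fuse_in_frame("ATGGCCTAA", "ATGAAATGA", linker="GGATCC"))
```

A design that fails either check would need its junction adjusted, for example by adding or
removing nucleotides in the linker, before ligation or synthesis.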

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides
of the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more
particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo
by use of physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for
example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For
additional reviews of gene therapy technology see Friedmann, Science, 244: 1275-1281
(1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992).
Introduction of any one of the nucleotides of the present invention or a gene encoding the
polypeptides of the present invention can also be accomplished with extrachromosomal
substrates (transient expression) or artificial chromosomes (stable expression). Cells may
also be cultured ex vivo in the presence of proteins of the present invention in order to
proliferate or to produce a desired effect on or activity in such cells. Treated cells can then
be introduced in vivo for therapeutic purposes. Alternatively, it is contemplated that in other
human disease states, preventing the expression of or inhibiting the activity of polypeptides
of the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated
RNA sequences, by methods known in the art. Further, the polypeptides of the present
invention can be inhibited by using targeted deletion methods, or the insertion of a negative
regulatory element such as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to
express the polynucleotides of the invention, wherein such polynucleotides are in operative
association with a regulatory sequence heterologous to the host cell which drives expression of
the polynucleotides in the cell. These methods can be used to increase or decrease the
expression of the polynucleotides of the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease, expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression by
replacing, in whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the protein at higher levels. The heterologous promoter is
inserted in such a manner that it is operatively linked to the desired protein encoding sequences.
See, for example, PCT International Publication No. WO 94/12650, PCT International
Publication No. WO 92/20808, and PCT International Publication No. WO 91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA (e.g.,
ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate synthase,
aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be inserted along with
the heterologous promoter DNA. If linked to the desired protein coding sequence,
amplification of the marker DNA by standard selection methods results in co-amplification of
the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control
of inducible regulatory elements, in which case the regulatory sequences of the endogenous
gene may be replaced by homologous recombination. As described herein, gene targeting can
be used to replace a gene's existing regulatory region with a regulatory sequence isolated from
a different gene or a novel regulatory sequence synthesized by genetic engineering methods.
Such regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment
regions, negative regulatory elements, transcriptional initiation sites, regulatory protein binding
sites or combinations of said sequences. Alternatively, sequences which affect the structure or
stability of the RNA or protein produced may be replaced, removed, added, or otherwise
modified by targeting. These sequences include polyadenylation signals, mRNA stability
elements, splice sites, leader sequences for enhancing or modifying transport or secretion
properties of the protein, or other sequences which alter or improve the function or stability of
protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different cell-type
specificity than the naturally occurring elements. Here, the naturally occurring sequences are
deleted and new sequences are added. In all cases, the identification of the targeting event may
be facilitated by the use of one or more selectable marker genes that are contiguous with the
targeting DNA, allowing for the selection of cells in which the exogenous DNA has integrated
into the cell genome. The identification of the targeting event may also be facilitated by the use
of one or more marker genes exhibiting the property of negative selection, such that the
negatively selectable marker is linked to the exogenous DNA, but configured such that the
negatively selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination
are referred to as "knockout" animals. Knockout animals, preferably non-human mammals,
can be prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference.
Transgenic animals are useful to determine the roles polypeptides of the invention play in
biological processes, and preferably in disease states. Transgenic animals are useful as model
systems to identify compounds that modulate lipid metabolism. Transgenic animals,
preferably non-human mammals, are produced using methods as described in U.S. Patent No.
5,489,743 and PCT Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of
expression of the polypeptides of the invention. Inactivation can be carried out using
homologous recombination methods described above. Activation can be achieved by
supplementing or even replacing the homologous promoter to provide for increased protein
expression. The homologous promoter can be supplemented by insertion of one or more
heterologous enhancer elements known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knock out strategies, of animals that fail to
express polypeptides of the invention or that express a variant polypeptide. Such animals are
useful as models for studying the in vivo activities of polypeptide as well as for studying
modulators of the polypeptides of the invention.


4.10 USES AND BIOLOGICAL ACTIVITY

The polynucleotides and proteins of the present invention are expected to exhibit one
or more of the uses or biological activities (including those associated with assays cited
herein) identified herein. Uses or activities described for proteins of the present invention
may be provided by administration or use of such proteins or of polynucleotides encoding
such proteins (such as, for example, in gene therapies or vectors suitable for introduction of
DNA). The mechanism underlying the particular condition or pathology will dictate whether
the polypeptides of the invention, the polynucleotides of the invention or modulators
(activators or inhibitors) thereof would be beneficial to the subject in need of treatment.
Thus, "therapeutic compositions of the invention" include compositions comprising isolated
polynucleotides (including recombinant DNA molecules, cloned genes and degenerate
variants thereof) or polypeptides of the invention (including full length protein, mature
protein and truncations or domains thereof), or compounds and other substances that
modulate the overall activity of the target gene products, either at the level of target
gene/protein expression or target protein activity. Such modulators include polypeptides,
analogs (variants), including fragments and fusion proteins, antibodies and other binding
proteins; chemical compounds that directly or indirectly activate or inhibit the polypeptides
of the invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of the
polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular 
activation or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage
of tissue differentiation or development or in disease states); as molecular weight markers on
gels; as chromosome markers or tags (when labeled) to identify chromosomes or to map
related gene positions; to compare with endogenous DNA sequences in patients to identify
potential genetic disorders; as probes to hybridize and thus discover novel, related DNA
sequences; as a source of information to derive PCR primers for genetic fingerprinting; as a
probe to "subtract-out" known sequences in the process of discovering other novel
polynucleotides; for selecting and making oligomers for attachment to a "gene chip" or other
support, including for examination of expression patterns; to raise anti-protein antibodies
using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or
elicit another immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand interaction),
the polynucleotide can also be used in interaction trap assays (such as, for example, that
described in Gyuris et al., Cell 75:791-803 (1993)) to identify polynucleotides encoding the
other protein with which binding occurs or to identify inhibitors of the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including
the labeled reagent) in assays designed to quantitatively determine levels of the protein (or
its receptor) in biological fluids; as markers for tissues in which the corresponding
polypeptide is preferentially expressed (either constitutively or at a particular stage of tissue
differentiation or development or in a disease state); and, of course, to isolate correlative
receptors or ligands. Proteins involved in these binding interactions can also be used to
screen for peptide or small molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent
grade or kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the
art. References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F.
Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular
Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source of
carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be added to
the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the case
of microorganisms, the polypeptide or polynucleotide of the invention can be added to the
medium in or on which the microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or
inhibiting) activity or may induce production of other cytokines in certain cell populations.
A polynucleotide of the invention can encode a polypeptide exhibiting such attributes.
Many protein factors discovered to date, including all known cytokines, have exhibited
activity in one or more factor-dependent cell proliferation assays, and hence the assays serve
as a convenient confirmation of cytokine activity. The activity of therapeutic compositions
of the present invention is evidenced by any one of a number of routine factor dependent cell
proliferation assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9,
B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1,
Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can be used in
the following:

Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986;
Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology
133:327-341, 1991; Bertagnolli, et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J.
Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells
or thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of
mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E.
e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938,
1983; Measurement of mouse and human interleukin 6, Nordan, R. In Current Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991;
Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of human
Interleukin 11, Bennett, F., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols
in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and Sons, Toronto. 1991;
Measurement of mouse and human Interleukin 9, Ciarletta, A., Giannotti, J., Clark, S. C.
and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1,
John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others,
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring
proliferation and cytokine production) include, without limitation, those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies,
E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience
(Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6, Cytokines and their
cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et al., Proc.
Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun. 11:405-411,
1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity 
and be involved in the proliferation, differentiation and survival of pluripotent and totipotent 
stem cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells 
and/or germ line stem cells. Administration of the polypeptide of the invention to stem cells
in vivo or ex vivo is expected to maintain and expand cell populations in a totipotential or
pluripotential state which would be useful for re-engineering damaged or diseased tissues,
transplantation, manufacture of bio-pharmaceuticals and the development of bio-sensors.
The ability to produce large quantities of human cells has important working applications for
the production of human proteins which currently must be obtained from non-human sources
or donors; implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells,
gastrointestinal cells and others; and organs for transplantation such as kidney, liver,
pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines 
may be administered in combination with the polypeptide of the invention to achieve the 
desired effect, including any of the growth factors listed herein, other stem cell maintenance
factors, and specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF),
Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6,
macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF,
thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor (PDGF),
neural growth factors and basic fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion 
of these cells in culture will facilitate the production of large quantities of mature cells. 
Techniques for culturing stem cells are known in the art and administration of polypeptides 
of the invention, optionally with other growth factors and/or cytokines, is expected to 
enhance the survival and proliferation of the stem cell populations. This can be
accomplished by direct administration of the polypeptide of the invention to the culture
medium. Alternatively, stromal cells transfected with a polynucleotide that encodes the
polypeptide of the invention can be used as a feeder layer for the stem cell populations in
culture or in vivo. Stromal support cells for feeder layers may include embryonic bone
marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured embryonic
fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to 
induce autocrine expression of the polypeptide of the invention. This will allow for 
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as is 
or that can then be differentiated into the desired mature cell types. These stable cell lines
can also serve as a source of undifferentiated totipotential/pluripotential mRNA to create
cDNA libraries and templates for polymerase chain reaction experiments. These studies
would allow for the isolation and identification of differentially expressed genes in stem cell
populations that regulate stem cell proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present 
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial cells
that can be used to augment or replace cells damaged by illness, autoimmune disease,
accidental damage or genetic disorders. The polypeptide of the invention may be useful for
inducing the proliferation of neural cells and for the regeneration of nerve and brain tissue,
i.e. for the treatment of central and peripheral nervous system diseases and neuropathies, as
well as mechanical and traumatic disorders which involve degeneration, death or trauma to
neural cells or nerve tissue. In addition, the expanded stem cell populations can also be 
genetically altered for gene therapy purposes and to decrease host rejection of replacement 
tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated
cell types. A broadly applicable method of obtaining pure populations of a specific
differentiated cell type from undifferentiated stem cell populations involves the use of a
cell-type specific promoter driving a selectable marker. The selectable marker allows only cells
of the desired type to survive. For example, stem cells can be induced to differentiate into
cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin.
Invest., 98(1): 216-224, (1996)) or skeletal muscle cells (Browder, L. W. In: Principles of
Tissue Engineering eds. Lanza et al., Academic Press (1997)). Alternatively, directed
differentiation of stem cells can be accomplished by culturing the stem cells in the presence
of a differentiation factor such as retinoic acid and an antagonist of the polypeptide of the
invention which would inhibit the effects of endogenous stem cell factor activity and allow
differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one of
various cell sources (including hematopoietic stem cells and embryonic stem cells) and
cultured on a feeder layer, as described by Thomson et al., Proc. Natl. Acad. Sci. U.S.A.,
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in
combination with other growth factors or cytokines. The ability of the polypeptide of the
invention to induce stem cell proliferation is determined by colony formation on semi-solid
support, e.g., as described by Bernstein et al., Blood, 77: 2316-2321 (1991).

WO 2004/080148 PCT/US2003/030720

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders.
Even marginal biological activity in support of colony forming cells or of factor-dependent
cell lines indicates involvement in regulating hematopoiesis, e.g. in supporting the growth
and proliferation of erythroid progenitor cells alone or in combination with other cytokines,
thereby indicating utility, for example, in treating various anemias or for use in conjunction
with irradiation/chemotherapy to stimulate the production of erythroid precursors and/or
erythroid cells; in supporting the growth and proliferation of myeloid cells such as
granulocytes and monocytes/macrophages (i.e., traditional CSF activity) useful, for example,
in conjunction with chemotherapy to prevent or treat consequent myelo-suppression; in
supporting the growth and proliferation of megakaryocytes and consequently of platelets
thereby allowing prevention or treatment of various platelet disorders such as
thrombocytopenia, and generally for use in place of or complementary to platelet
transfusions; and/or in supporting the growth and proliferation of hematopoietic stem cells
which are capable of maturing to any and all of the above-mentioned hematopoietic cells and
therefore find therapeutic utility in various stem cell disorders (such as those usually treated
with transplantation, including, without limitation, aplastic anemia and paroxysmal nocturnal
hemoglobinuria), as well as in repopulating the stem cell compartment post
irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or
heterologous)) as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 
Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al.,
Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915,
1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994;
Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic
colony forming cells with high proliferative potential, McNiece, I. K. and Briddell, R. A. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc.,
New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994;
Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y. 1994; Long term
bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter, M. and Allen,
T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term culture initiating cell assay, Sutherland, H. J. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc.,
New York, N.Y. 1994.



4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, 
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing and
tissue repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved
fixation of artificial joints. De novo bone formation induced by an osteogenic agent 
contributes to the repair of congenital, trauma induced, or oncologic resection induced 
craniofacial defects, and also is useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by
blocking inflammation or processes of tissue destruction (collagenase activity, osteoclast
activity, etc.) mediated by inflammatory processes may also be possible using the
composition of the invention. 

Another category of tissue regeneration activity that may involve the polypeptide of 
the present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue 
or other tissue formation in circumstances where such tissue is not normally formed, has
application in the healing of tendon or ligament tears, deformities and other tendon or
ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue. De
novo tendon/ligament-like tissue formation induced by a composition of the present
invention contributes to the repair of congenital, trauma induced, or other tendon or ligament
defects of other origin, and is also useful in cosmetic plastic surgery for attachment or repair
of tendons or ligaments. The compositions of the present invention may provide an
environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to
effect tissue repair. The compositions of the invention may also be useful in the treatment of
tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The compositions
may also include an appropriate matrix and/or sequestering agent as a carrier as is well
known in the art. 

The compositions of the present invention may also be useful for proliferation of 
neural cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve
tissue. More specifically, a composition may be used in the treatment of diseases of the
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and
localized neuropathies, and central nervous system diseases, such as Alzheimer's,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager
syndrome. Further conditions which may be treated in accordance with the present invention
include mechanical and traumatic disorders, such as spinal cord disorders, head trauma and
cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from
chemotherapy or other medical therapies may also be treatable using a composition of the
invention. 

Compositions of the invention may also be useful to promote better or faster closure 
of non-healing wounds, including without limitation pressure ulcers, ulcers associated with 
vascular insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, 
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular 
(including vascular endothelium) tissue, or for promoting the growth of cells comprising 
such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic
scarring to allow normal tissue to regenerate. A polypeptide of the present invention may
also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or 
inhibiting differentiation of tissues described above from precursor tissues or cells; or for 
inhibiting the growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 
Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International
Patent Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: 
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest.
Dermatol. 71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for which assays are
described herein. A polynucleotide of the invention can encode a polypeptide exhibiting
such activities. A protein may be useful in the treatment of various immune deficiencies and
disorders (including severe combined immunodeficiency (SCID)), e.g., in regulating (up or
down) growth and proliferation of T and/or B lymphocytes, as well as affecting the cytolytic
activity of NK cells and other cell populations. These immune deficiencies may be genetic or
be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from
autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial,
fungal or other infection may be treatable using a protein of the present invention, including
infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania spp., malaria
spp. and various fungal infections such as candidiasis. Of course, in this regard, proteins of
the present invention may also be useful where a boost to the immune system generally may
be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus 
erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre 
syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis,
graft-versus-host disease and autoimmune inflammatory eye disease. Such a protein (or
antagonists thereof, including antibodies) of the present invention may also be useful in
the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum sickness, drug
reactions, food allergies, insect venom allergies, mastocytosis, allergic rhinitis,
hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic dermatitis, allergic
contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic conjunctivitis,
atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary conjunctivitis and
contact allergies), such as asthma (particularly allergic asthma) or other respiratory
problems. Other conditions, in which immune suppression is desired (including, for
example, organ transplantation), may also be treatable using a protein (or antagonists
thereof) of the present invention. The therapeutic effects of the polypeptides or antagonists
thereof on allergic reactions can be evaluated by in vivo animal models such as the
cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 1998), skin
prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization test
(Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or
blocking an immune response already in progress or may involve preventing the induction of
an immune response. The functions of activated T cells may be inhibited by suppressing T
cell responses or by inducing specific tolerance in T cells, or both. Immunosuppression of T
cell responses is generally an active, non-antigen-specific, process which requires continuous
exposure of the T cells to the suppressive agent. Tolerance, which involves inducing
non-responsiveness or anergy in T cells, is distinguishable from immunosuppression in that
it is generally antigen-specific and persists after exposure to the tolerizing agent has ceased. 
Operationally, tolerance can be demonstrated by the lack of a T cell response upon 
reexposure to specific antigen in the absence of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin
and organ transplantation and in graft-versus-host disease (GVHD). For example, blockage
of T cell function should result in reduced tissue destruction in tissue transplantation.
Typically, in tissue transplants, rejection of the transplant is initiated through its recognition
as foreign by T cells, followed by an immune reaction that destroys the transplant. The
administration of a therapeutic composition of the invention may prevent cytokine synthesis
by immune cells, such as T cells, and thus acts as an immunosuppressant. Moreover, a lack
of costimulation may also be sufficient to anergize the T cells, thereby inducing tolerance in
a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking reagents may
avoid the necessity of repeated administration of these blocking reagents. To achieve
sufficient immunosuppression or tolerance in a subject, it may also be necessary to block the
function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad.
Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul ed.,
Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used to
determine the effect of therapeutic compositions of the invention on the development of that
disease.

Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate activation
of T cells that are reactive against self-tissue and which promote the production of cytokines
and autoantibodies involved in the pathology of the diseases. Preventing the activation of
autoreactive T cells may reduce or eliminate disease symptoms. Administration of reagents
which block stimulation of T cells can be used to inhibit T cell activation and prevent
production of autoantibodies or T cell-derived cytokines which may be involved in the
disease process. Additionally, blocking reagents may induce antigen-specific tolerance of
autoreactive T cells which could lead to long-term relief from the disease. The efficacy of
blocking reagents in preventing or alleviating autoimmune disorders can be determined
using a number of well-characterized animal models of human autoimmune diseases.
Examples include murine experimental autoimmune encephalitis, systemic lupus
erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune collagen
arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental myasthenia
gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a 
means of up regulating immune responses, may also be useful in therapy. Upregulation of 
immune responses may be in the form of enhancing an existing immune response or eliciting
an initial immune response. For example, enhancing an immune response may be useful in
cases of viral infection, including systemic viral diseases such as influenza, the common 
cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory
form of a soluble peptide of the present invention and reintroducing the in vitro activated T
cells into the patient. Another method of enhancing anti-viral immune responses would be to
isolate infected cells from a patient, transfect them with a nucleic acid encoding a protein of
the present invention as described herein such that the cells express all or a portion of the
protein on their surface, and reintroduce the transfected cells into the patient. The infected
cells would now be capable of delivering a costimulatory signal to, and thereby activating, T
cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal
to T cells to induce a T cell mediated immune response against the transfected tumor cells.
In addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected
with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion)
of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II
alpha chain protein and an MHC class II beta chain protein to thereby express MHC class I
or MHC class II proteins on the cell surface. Expression of the appropriate class I or class II
MHC in conjunction with a peptide having the activity of a B lymphocyte antigen (e.g.,
B7-1, B7-2, B7-3) induces a T cell mediated immune response against the transfected tumor
cell. Optionally, a gene encoding an antisense construct which blocks expression of an MHC
class II associated protein, such as the invariant chain, can also be cotransfected with a DNA
encoding a peptide having the activity of a B lymphocyte antigen to promote presentation of
tumor associated antigens and induce tumor specific immunity. Thus, the induction of a T
cell mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al.,
J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et 
al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 
1994. 

Assays for T-cell-dependent immunoglobulin responses and isotype switching 
(which will identify, among others, proteins that modulate T-cell dependent antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those described in:
Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro
antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J.
E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, 
proteins that generate predominantly Th1 and CTL responses) include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986;
Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783,
1992.

Dendritic cell-dependent assays (which will identify, among others, proteins 
expressed by dendritic cells that activate naive T-cells) include, without limitation, those 
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264,
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et al., 
Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that regulate
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz et
al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et
al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk,
Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993;
Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine
et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995;
Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate
the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present
invention, alone or in heterodimers with a member of the inhibin family, may be useful as a
contraceptive based on the ability of inhibins to decrease fertility in female mammals and
decrease spermatogenesis in male mammals. Administration of sufficient amounts of other
inhibins can induce infertility in these mammals. Alternatively, the polypeptide of the
invention, as a homodimer or as a heterodimer with other protein subunits of the inhibin
group, may be useful as a fertility inducing therapeutic, based upon the ability of activin
molecules to stimulate FSH release from cells of the anterior pituitary. See, for example,
U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for advancement
of the onset of fertility in sexually immature mammals, so as to increase the lifetime
reproductive performance of domestic animals such as, but not limited to, cows, sheep and
pigs. 

The activity of a polypeptide of the invention may, among other means, be measured 
by the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et
al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc.
Natl. Acad. Sci. USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes, fibroblasts,
neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes.
Chemotactic and chemokinetic receptor activation can be used to mobilize or attract a
desired cell population to a desired site of action. Chemotactic or chemokinetic compositions
(e.g., proteins, antibodies, binding partners, or modulators of the invention) provide particular
advantages in treatment of wounds and other trauma to tissues, as well as in treatment of
localized infections. For example, attraction of lymphocytes, monocytes or neutrophils to
tumors or sites of infection may result in improved immune responses against the tumor or
infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell
population. Preferably, the protein or peptide has the ability to directly stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a population of
cells can be readily determined by employing such protein or peptide in any known assay for
cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E.
Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene
Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta
Chemokines 6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al.,
APMIS 103:140-146, 1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of
Immunol. 152:5860-5867, 1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders
(including hereditary disorders, such as hemophilias) or to enhance coagulation and other
hemostatic events in treating wounds resulting from trauma, surgery or other causes. A
composition of the invention may also be useful for dissolving or inhibiting formation of
thromboses and for treatment and prevention of conditions resulting therefrom (such as, for
example, infarction of cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub,
Prostaglandins 35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation
or metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer.
For example, the presence or increased expression of a polynucleotide/polypeptide of the
invention may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing
malignancy. Conversely, a defect in the gene or absence of the polypeptide may be
associated with a cancer condition. Identification of single nucleotide polymorphisms
associated with cancer or a predisposition to cancer may also be useful for diagnosis or
prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor
growth) and/or prohibiting metastasis by reducing tumor cell motility or invasiveness.
Therapeutic compositions of the invention may be effective in adult and pediatric oncology
including in solid phase tumors/malignancies, locally advanced tumors, human soft tissue
sarcomas, metastatic cancer, including lymphatic metastases, blood cell malignancies
including multiple myeloma, acute and chronic leukemias, and lymphomas, head and neck
cancers including mouth cancer, larynx cancer and thyroid cancer, lung cancers including
small cell carcinoma and non-small cell cancers, breast cancers including small cell
carcinoma and ductal carcinoma, gastrointestinal cancers including esophageal cancer,
stomach cancer, colon cancer, colorectal cancer and polyps associated with colorectal
neoplasia, pancreatic cancers, liver cancer, urologic cancers including bladder cancer and
prostate cancer, malignancies of the female genital tract including ovarian carcinoma, uterine
(including endometrial) cancers, and solid tumor in the ovarian follicle, kidney cancers
including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant
melanoma, tumor progression of human skin keratinocytes, squamous cell carcinoma, basal
cell carcinoma, hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including inhibitors and stimulators of the biological activity of the polypeptide of the
invention) may be administered to treat cancer. Therapeutic compositions can be
administered in therapeutically effective dosages alone or in combination with adjuvant
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser
therapy, and may provide a beneficial effect, e.g., reducing tumor size, slowing rate of tumor
growth, inhibiting metastasis, or otherwise improving overall clinical condition, without
necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a
pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer
treatment is routine. Anti-cancer drugs that are well known in the art and can be used as a
treatment in combination with the polypeptide or modulator of the invention include:
Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan, Carboplatin,
Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine HCl
(Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl, Doxorubicin HCl,
Estramustine phosphate sodium, Etoposide (VP-16-213), Floxuridine, 5-Fluorouracil (5-FU),
Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a, Interferon
Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine, Mechlorethamine
HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX),
Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, Streptozocin,
Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing
cancers. Under these circumstances, it may be beneficial to treat these individuals with
therapeutically effective doses of the polypeptide of the invention to reduce the risk of
developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays
of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987)
Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18
and Ch 21), tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst.,
52:921-30 (1974), mobility and invasive potential of tumor cells in Boyden Chamber assays
as described in Pilkington et al., Anticancer Res., 17:4107-9 (1997), and angiogenesis
assays such as induction of vascularization of the chick chorioallantoic membrane or
induction of vascular endothelial cell migration as described in Ribatta et al., Intl. J. Dev.
Biol., 40:1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9 (1999), respectively.
Suitable tumor cell lines are available, e.g., from American Type Tissue Culture Collection
catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of
the invention can encode a polypeptide exhibiting such characteristics. Examples of such
receptors and ligands include, without limitation, cytokine receptors and their ligands,
receptor kinases and their ligands, receptor phosphatases and their ligands, receptors
involved in cell-cell interactions and their ligands (including without limitation, cellular
adhesion molecules (such as selectins, integrins and their ligands) and receptor/ligand pairs
involved in antigen presentation, antigen recognition and development of cellular and
humoral immune responses). Receptors and ligands are also useful for screening of potential
peptide or small molecule inhibitors of the relevant receptor/ligand interaction. Proteins of
the present invention (including, without limitation, fragments of receptors and ligands) may
themselves be useful as inhibitors of receptor/ligand interactions.

The activity of a polypeptide of the invention may, among other means, be measured
by the following methods:

Suitable assays for receptor-ligand activity include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static conditions
7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al.,
J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160, 1989;
Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be
identified through binding assays, affinity chromatography, dihybrid screening assays,
BIAcore assays, gel overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides
of the present invention or ligand(s) thereof may be labeled by being coupled to
radioisotopes, colorimetric molecules or toxin molecules by conventional methods
("Guide to Protein Purification," Murray P. Deutscher (ed.), Methods in Enzymology Vol. 182
(1990), Academic Press, Inc., San Diego). Examples of radioisotopes include, but are not
limited to, tritium and carbon-14. Examples of colorimetric molecules include, but are not
limited to, fluorescent molecules such as fluorescamine, or rhodamine or other colorimetric
molecules. Examples of toxins include, but are not limited to, ricin.



4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening
techniques. The polypeptides or fragments employed in such a test may either be free in
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably
transformed with recombinant nucleic acids expressing the polypeptide or a fragment
thereof. Drugs are screened against such transformed cells in competitive binding assays.
Such cells, either in viable or fixed form, can be used for standard binding assays. One may
measure, for example, the formation of complexes between polypeptides of the invention or
fragments and the agent being tested or examine the diminution in complex formation
between the novel polypeptides and an appropriate cell line, such cell lines being well known
in the art.

Sources for test compounds that may be screened for ability to bind to or modulate
(i.e., increase or decrease) the activity of polypeptides of the invention include (1) inorganic
and organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or
marine microorganisms or (2) extraction of the organisms themselves. Natural product
libraries include polyketides, non-ribosomal peptides, and (non-naturally occurring) variants
thereof. For a review, see Science 252:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides
or organic compounds and can be readily prepared by traditional automated synthesis
methods, PCR, cloning or proprietary synthetic methods. Of particular interest are peptide
and oligonucleotide combinatorial libraries. Still other libraries of interest include peptide,
protein, peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide
libraries. For a review of combinatorial chemistry and libraries created therefrom, see
Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of
peptidomimetic libraries, see Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby
et al., Curr. Opin. Chem. Biol., 1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem.,
4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit"
to bind a polypeptide of the invention. The molecules identified in the binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal models that are
well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of
the binding molecule for a polypeptide of the invention. Alternatively, the binding
molecules may be complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For
example, expression cloning using mammalian or bacterial cells, or dihybrid screening
assays can be used to identify polynucleotides encoding binding partners. As another
example, affinity chromatography with the appropriate immobilized polypeptide of the
invention can be used to isolate polypeptides that recognize and bind polypeptides of the
invention. There are a number of different libraries used for the identification of
compounds, and in particular small molecules, that modulate (i.e., increase or decrease)
biological activity of a polypeptide of the invention. Ligands for receptor polypeptides of the
invention can also be identified by adding exogenous ligands, or cocktails of ligands, to two
cell populations that are genetically identical except for the expression of the receptor of the
invention: one cell population expresses the receptor of the invention whereas the other does
not. The responses of the two cell populations to the addition of ligand(s) are then
compared. Alternatively, an expression library can be co-expressed with the polypeptide of
the invention in cells and assayed for an autocrine response to identify potential ligand(s). As
still another example, BIAcore assays, gel overlay assays, or other methods known in the art
can be used to identify binding partners from sources including (1) organic and inorganic
chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of
random peptides, oligonucleotides or organic molecules.
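
The comparison of two otherwise identical cell populations described above lends itself to a simple numerical readout. The following Python sketch is illustrative only and is not part of the disclosed methods: the ligand names, the signal values, and the two-fold cutoff are hypothetical assumptions. It flags test ligands whose response in receptor-expressing cells exceeds the response in the matched non-expressing cells.

```python
from typing import Dict, List


def candidate_ligands(
    expressing: Dict[str, float],
    control: Dict[str, float],
    min_fold: float = 2.0,  # hypothetical cutoff, chosen only for illustration
) -> List[str]:
    """Return ligands whose signal in receptor-expressing cells is at least
    min_fold times the signal in the matched non-expressing cells."""
    hits = []
    for ligand, signal in expressing.items():
        baseline = control.get(ligand)
        # Skip ligands with no matched control or a non-positive baseline.
        if baseline is None or baseline <= 0:
            continue
        if signal / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Hypothetical readouts (arbitrary units) for three test ligands.
expressing_cells = {"ligand_A": 10.0, "ligand_B": 1.1, "ligand_C": 4.0}
control_cells = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 3.9}

print(candidate_ligands(expressing_cells, control_cells))
```

With these made-up numbers only ligand_A is flagged, because it is the only ligand whose response depends on expression of the receptor. In practice the readout and the cutoff would depend on the assay used.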

The role of downstream intracellular signaling molecules in the signaling cascade of
the polypeptide of the invention can be determined. For example, a chimeric protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to the
extracellular portion of a protein, whose ligand has been identified, is produced in a host
cell. The cell is then incubated with the ligand specific for the extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins
involved in intracellular signaling can then be assayed for expected modifications, i.e.,
phosphorylation. Other methods known to those in the art can also be used to identify
signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity.
The anti-inflammatory activity may be achieved by providing a stimulus to cells involved in
the inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for
example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the
inflammatory process, inhibiting or promoting cell extravasation, or by stimulating or
suppressing production of other factors which more directly inhibit or promote an



WO 2004/080148 



PCT/US2003/030720 



66 

inflammatory response. Compositions with such activities can be used to treat inflammatory
conditions (including chronic or acute conditions), including without limitation inflammation
associated with infection (such as septic shock, sepsis or systemic inflammatory response
syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis,
complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-induced lung
injury, inflammatory bowel disease, Crohn's disease or resulting from over production of
cytokines such as TNF or IL-1. Compositions of the invention may also be useful to treat
anaphylaxis and hypersensitivity to an antigenic substance or material. Compositions of this
invention may be utilized to prevent or treat conditions such as, but not limited to, sepsis,
acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid arthritis, chronic
inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, graft versus
host disease, inflammatory bowel disease, inflammation associated with pulmonary disease,
other autoimmune disease or inflammatory disease, or may be utilized as an antiproliferative
agent such as for acute or chronic myelogenous leukemia or in the prevention of premature
labor secondary to intrauterine infections.



4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of
the invention. Such leukemias and related disorders include but are not limited to acute
leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic,
promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic
myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such
disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient
(including human and non-human mammalian patients) according to the invention include
but are not limited to the following lesions of either the central (including spinal cord, brain)
or peripheral nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated
with surgery, for example, lesions which sever a portion of the nervous system, or
compression injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or
injured as a result of infection, for example, by an abscess or associated with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme
disease, tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or
amyotrophic lateral sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of
the nervous system is destroyed or injured by a nutritional disorder or disorder of
metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus,
carcinoma, or sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various
etiologies, progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival
or differentiation of neurons. For example, and not by way of limitation, therapeutics which
elicit any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method
set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons
may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or
Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor
neuron dysfunction may be measured by assessing the physical manifestation of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to
toxin, trauma, surgical damage, degenerative disease or malignancy that may affect motor
neurons as well as other components of the nervous system, as well as disorders that
selectively affect neurons such as amyotrophic lateral sclerosis, and including but not limited
to progressive spinal muscular atrophy, progressive bulbar palsy, primary lateral sclerosis,
infantile and juvenile muscular atrophy, progressive bulbar paralysis of childhood
(Fazio-Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary
Motorsensory Neuropathy (Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites;
affecting (suppressing or enhancing) bodily characteristics, including, without limitation,
height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or
organ or body part size or shape (such as, for example, breast augmentation or diminution,
change in bone form or shape); affecting biorhythms or circadian cycles or rhythms;
affecting the fertility of male or female subjects; affecting the metabolism, catabolism,
anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein,
carbohydrate, vitamins, minerals, co-factors or other nutritional factors or components;
affecting behavioral characteristics, including, without limitation, appetite, libido, stress,
cognition (including cognitive disorders), depression (including depressive disorders) and
violent behaviors; providing analgesic effects or other pain reducing effects; promoting
differentiation and growth of embryonic stem cells in lineages other than hematopoietic
lineages; hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of
the enzyme and treating deficiency-related diseases; treatment of hyperproliferative
disorders (such as, for example, psoriasis); immunoglobulin-like activity (such as, for
example, the ability to bind antigens or complement); and the ability to act as an antigen in a
vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential
predisposition or susceptibility to various disease states (such as disorders involving
inflammation or immune response) or a differential response to drug administration, and this
genetic information can be used to tailor preventive or therapeutic treatment appropriately.
For example, the existence of a polymorphism associated with a predisposition to
inflammation or autoimmune disease makes possible the diagnosis of this condition in
humans by identifying the presence of the polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, 
optionally involving isolation or amplification of the DNA, and identifying the presence of 
the polymorphism in the DNA. For example, PCR may be used to amplify an appropriate 

30 fragment of genomic DNA which may then be sequenced. Alternatively, the DNA may be 
subjected to allele-specific oligonucleotide hybridization (in which appropriate 
oligonucleotides are hybridized to the DNA under conditions permitting detection of a single 
base mismatch) or to a single nucleotide extension assay (in which an oligonucleotide that 
hybridizes immediately adjacent to the position of the polymorphism is extended with one or 
more labeled nucleotides). In addition, traditional restriction fragment length polymorphism 
analysis (using restriction enzymes that provide differential digestion of the genomic DNA 
depending on the presence or absence of the polymorphism) may be performed. Arrays with 
nucleotide sequences of the present invention can be used to detect polymorphisms. The 
array can comprise modified nucleotide sequences of the present invention in order to detect 
the nucleotide sequences of the present invention. In the alternative, any one of the 
nucleotide sequences of the present invention can be placed on the array to detect changes 
from those sequences. 
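
As an illustration of the restriction fragment length polymorphism analysis described above, the following minimal Python sketch digests two alleles of a short amplicon in silico. The two sequences, the function name digest, and the choice of the EcoRI recognition site (GAATTC, cut after the first G) are assumptions made for illustration only; they are not sequences of the invention.

```python
def digest(seq, site, cut_offset):
    """Return the fragment lengths produced by cutting seq at every site."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)      # position of the cut within seq
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 24-base amplicons differing at a single base (A -> G at index 8).
reference_allele = "ACGTACGAATTCTTGACCGTAGCA"
variant_allele = "ACGTACGAGTTCTTGACCGTAGCA"

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_OFFSET = 1        # EcoRI cuts between G and AATTC

reference_fragments = digest(reference_allele, ECORI_SITE, ECORI_OFFSET)
variant_fragments = digest(variant_allele, ECORI_SITE, ECORI_OFFSET)

print(reference_fragments)  # [7, 17]
print(variant_fragments)    # [24]
```

In this sketch the single-base change destroys the recognition site, so the variant allele yields one uncut fragment while the reference allele yields two, which is the differential digestion on which the analysis relies.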

Alternatively, a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, 
e.g., by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against 
rheumatoid arthritis are determined in an experimental animal model system. The 
experimental model system is adjuvant induced arthritis in rats, and the protocol is described 
by J. Holoshitz et al., 1983, Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. 
Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a single 
injection, generally intradermally, of a suspension of killed Mycobacterium tuberculosis in 
complete Freund's adjuvant (CFA). The route of injection can vary, but rats may be injected 
at the base of the tail with an adjuvant mixture. The polypeptide is administered in phosphate 
buffered solution (PBS) at a dose of about 1-5 mg/kg. The control consists of administering 
PBS only. 

The procedure for testing the effects of the test compound would consist of 
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by immediately 
administering the test compound and subsequent treatment every other day until day 24. At 
14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium in CFA, an overall arthritis 
score may be obtained as described by J. Holoshitz above. An analysis of the data would 
reveal that the test compound would have a dramatic effect on the swelling of the joints as 
measured by a decrease of the arthritis score. 
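
The dosing and scoring timetable of this procedure can be laid out as a short Python sketch. It assumes that the adjuvant injection defines day 0, that the first dose is given on day 0, and that "every other day" means every second day through day 24; the variable names are illustrative only.

```python
# Assumptions: the adjuvant injection is day 0, the first dose is given on
# day 0, and "every other day" means every second day through day 24.
LAST_DAY = 24
treatment_days = list(range(0, LAST_DAY + 1, 2))
scoring_days = [14, 15, 18, 20, 22, 24]   # scoring days listed in the procedure

# Days on which an animal is both dosed and scored.
dosed_and_scored = [day for day in scoring_days if day in treatment_days]

print(treatment_days)
print(dosed_and_scored)  # [14, 18, 20, 22, 24]
```

Under these assumptions every scoring day except day 15 coincides with a treatment day.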



4.11 THERAPEUTIC METHODS 
The compositions (including polypeptide fragments, analogs, variants and antibodies 
or other binding partners or modulators including antisense polynucleotides) of the invention 
have numerous applications in a variety of therapeutic methods. Examples of therapeutic 
applications include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode 
of administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, 
weight, condition and response of the individual patient. Typically, the amount of 
polypeptide administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of 
body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body 
weight. For parenteral administration, polypeptides of the invention will be formulated in an 
injectable form combined with a pharmaceutically acceptable parenteral vehicle. Such 
vehicles are well known in the art and examples include water, saline, Ringer's solution, 
dextrose solution, and solutions consisting of small amounts of human serum albumin. 
The vehicle may contain minor amounts of additives that maintain the isotonicity and 
stability of the polypeptide or other active ingredient. The preparation of such solutions is 
within the skill of the art. 
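
The per-dose ranges stated above are expressed per kilogram of body weight and mix microgram and milligram units. The following Python sketch shows only the unit arithmetic for converting those ranges into absolute amounts for a given body weight; the function name, the constants, and the 70 kg example weight are illustrative assumptions, and the actual dosage remains a matter for the prescribing physician.

```python
UG_PER_MG = 1000.0

# Ranges stated in the text, expressed in micrograms per kg of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)      # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)    # 0.1 ug/kg to 10 mg/kg


def dose_bounds_ug(weight_kg, range_ug_per_kg):
    """Return (low, high) per-dose amounts in micrograms for a body weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low, high = range_ug_per_kg
    return low * weight_kg, high * weight_kg


# Hypothetical 70 kg patient, preferred range.
low_ug, high_ug = dose_bounds_ug(70.0, PREFERRED_RANGE_UG_PER_KG)
print(f"{low_ug:.1f} ug to {high_ug / UG_PER_MG:.0f} mg per dose")
```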

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION 

A protein or other composition of the present invention (from whatever source 
derived, including without limitation from recombinant and non-recombinant sources and 
including antibodies and other binding partners of the polypeptides of the invention) may be 
administered to a patient in need, by itself, or in pharmaceutical compositions where it is 
mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of 
disorders. Such a composition may optionally contain (in addition to protein or other active 
ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and other 
materials well known in the art. The term "pharmaceutically acceptable" means a non-toxic 
material that does not interfere with the effectiveness of the biological activity of the active 
ingredient(s). The characteristics of the carrier will depend on the route of administration. 
The pharmaceutical composition of the invention may also contain cytokines, lymphokines, 
or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, 
IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, 
G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In further 
compositions, proteins of the invention may be combined with other agents beneficial to the 
treatment of the disease or disorder in question. These agents include various growth factors 
such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF), transforming 
growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well as cytokines 
described herein. 

The pharmaceutical composition may further contain other agents which either 
enhance the activity of the protein or other active ingredient or complement its activity or 
use in treatment. Such additional factors and/or agents may be included in the 
pharmaceutical composition to produce a synergistic effect with protein or other active 
ingredient of the invention, or to minimize side effects. Conversely, protein or other active 
ingredient of the present invention may be included in formulations of the particular clotting 
factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic 
factor, or anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or 
anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, 
immunosuppressive agents). A protein of the present invention may be active in multimers 
(e.g., heterodimers or homodimers) or complexes with itself or other proteins. As a result, 
pharmaceutical compositions of the invention may comprise a protein of the invention in 
such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
Techniques for formulation and administration of the compounds of the instant application 
may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, 
latest edition. A therapeutically effective dose further refers to that amount of the compound 
sufficient to result in amelioration of symptoms, e.g., treatment, healing, prevention or 
amelioration of the relevant medical condition, or an increase in rate of treatment, healing, 
prevention or amelioration of such conditions. When applied to an individual active 
ingredient, administered alone, a therapeutically effective dose refers to that ingredient 
alone. When applied to a combination, a therapeutically effective dose refers to combined 
amounts of the active ingredients that result in the therapeutic effect, whether administered 
in combination, serially or simultaneously. 

In practicing the method of treatment or use of the present invention, a 
therapeutically effective amount of protein or other active ingredient of the present invention 
is administered to a mammal having a condition to be treated. Protein or other active 
ingredient of the present invention may be administered in accordance with the method of 
the invention either alone or in combination with other therapies such as treatments 
employing cytokines, lymphokines or other hematopoietic factors. When co-administered 
with one or more cytokines, lymphokines or other hematopoietic factors, protein or other 
active ingredient of the present invention may be administered either simultaneously with 
the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or 
anti-thrombotic factors, or sequentially. If administered sequentially, the attending physician 
will decide on the appropriate sequence of administering protein or other active ingredient of 
the present invention in combination with cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or anti-thrombotic factors. 



4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, 
transmucosal, or intestinal administration; parenteral delivery, including intramuscular, 
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, 
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of protein 
or other active ingredient of the present invention used in the pharmaceutical composition or 
to practice the method of the present invention can be carried out in a variety of conventional 
ways, such as oral ingestion, inhalation, topical application or cutaneous, subcutaneous, 
intraperitoneal, parenteral or intravenous injection. Intravenous administration to the patient 
is preferred. 

Alternately, one may administer the compound in a local rather than systemic 
manner, for example, via injection of the compound directly into arthritic joints or into 
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the 
scarring process frequently occurring as a complication of glaucoma surgery, the compounds 
may be administered topically, for example, as eye drops. Furthermore, one may administer 
the drug in a targeted drug delivery system, for example, in a liposome coated with a specific 
antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes will be targeted 
to and taken up selectively by the afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an 
effective dosage to the desired site of action. The determination of a suitable route of 
administration and an effective dosage for a particular indication is within the level of skill 
in the art. Preferably for wound treatment, one administers the therapeutic compound 
directly to the site. Suitable dosage ranges for the polypeptides of the invention can be 
extrapolated from these dosages or from similar studies in appropriate animal models. 
Dosages can then be adjusted as necessary by the clinician to provide maximal therapeutic 
benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus 
may be formulated in a conventional manner using one or more physiologically acceptable 
carriers comprising excipients and auxiliaries which facilitate processing of the active 
compounds into preparations which can be used pharmaceutically. These pharmaceutical 
compositions may be manufactured in a manner that is itself known, e.g., by means of 
conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, 
encapsulating, entrapping or lyophilizing processes. Proper formulation is dependent upon 
the route of administration chosen. When a therapeutically effective amount of protein or 
other active ingredient of the present invention is administered orally, protein or other active 
ingredient of the present invention will be in the form of a tablet, capsule, powder, solution 
or elixir. When administered in tablet form, the pharmaceutical composition of the invention 
may additionally contain a solid carrier such as a gelatin or an adjuvant. The tablet, capsule, 
and powder contain from about 5 to 95% protein or other active ingredient of the present 
invention, and preferably from about 25 to 90% protein or other active ingredient of the 
present invention. When administered in liquid form, a liquid carrier such as water, 
petroleum, oils of animal or plant origin such as peanut oil, mineral oil, soybean oil, or 
sesame oil, or synthetic oils may be added. The liquid form of the pharmaceutical 
composition may further contain physiological saline solution, dextrose or other saccharide 
solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. When 
administered in liquid form, the pharmaceutical composition contains from about 0.5 to 90% 
by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, 
protein or other active ingredient of the present invention will be in the form of a 
pyrogen-free, parenterally acceptable aqueous solution. The preparation of such parenterally 
acceptable protein or other active ingredient solutions, having due regard to pH, isotonicity, 
stability, and the like, is within the skill in the art. A preferred pharmaceutical composition 
for intravenous, cutaneous, or subcutaneous injection should contain, in addition to protein 
or other active ingredient of the present invention, an isotonic vehicle such as Sodium 
Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and Sodium Chloride 
Injection, Lactated Ringer's Injection, or other vehicle as known in the art. The 
pharmaceutical composition of the present invention may also contain stabilizers, 
preservatives, buffers, antioxidants, or other additives known to those of skill in the art. For 
injection, the agents of the invention may be formulated in aqueous solutions, preferably in 
physiologically compatible buffers such as Hanks' solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in 
the art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such 
carriers enable the compounds of the invention to be formulated as tablets, pills, dragees, 
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a 
patient to be treated. Pharmaceutical preparations for oral use can be obtained by combining 
the active compounds with a solid excipient, optionally grinding a resulting mixture, and 
processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain 
tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including 
lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize 
starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, 
hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or polyvinylpyrrolidone 
(PVP). If desired, disintegrating agents may be added, such as the cross-linked polyvinyl 
pyrrolidone, agar, or 
alginic acid or a salt thereof such as sodium alginate. Dragee cores are provided with 
suitable coatings. For this purpose, concentrated sugar solutions may be used, which may 
optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, 
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. 
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification or to 
characterize different combinations of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made 
of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol 
or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler 
such as lactose, binders such as starches, and/or lubricants such as talc or magnesium 
stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved 
or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene 
glycols. In addition, stabilizers may be added. All formulations for oral administration 
should be in dosages suitable for such administration. For buccal administration, the 
compositions may take the form of tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide 
or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined 
by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin 
for use in an inhaler or insufflator may be formulated containing a powder mix of the 
compound and a suitable powder base such as lactose or starch. The compounds may be 
formulated for parenteral administration by injection, e.g., by bolus injection or continuous 
infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampules 
or in multi-dose containers, with an added preservative. The compositions may take such 
forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain 
formulatory agents such as suspending, stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions 
of the active compounds in water-soluble form. Additionally, suspensions of the active 
compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic 
solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such 
as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain 
substances which increase the viscosity of the suspension, such as sodium carboxymethyl 
cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable 
stabilizers or agents which increase the solubility of the compounds to allow for the 
preparation of highly concentrated solutions. Alternatively, the active ingredient may be in 
powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before 
use. 

The compounds may also be formulated in rectal compositions such as suppositories 
or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or 
other glycerides. In addition to the formulations described previously, the compounds may 
also be formulated as a depot preparation. Such long acting formulations may be 
administered by implantation (for example subcutaneously or intramuscularly) or by 
intramuscular injection. Thus, for example, the compounds may be formulated with suitable 
polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion 
exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic 
polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent system. 
VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 
80, and 65% w/v polyethylene glycol 300, made up to volume in absolute ethanol. The VPD 
co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water 
solution. This co-solvent system dissolves hydrophobic compounds well, and itself produces 
low toxicity upon systemic administration. Naturally, the proportions of a co-solvent system 
may be varied considerably without destroying its solubility and toxicity characteristics. 
Furthermore, the identity of the co-solvent components may be varied: for example, other 
low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the fraction size of 
polyethylene glycol may be varied; other biocompatible polymers may replace polyethylene 
glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for 
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical compounds 
may be employed. Liposomes and emulsions are well known examples of delivery vehicles 
or carriers for hydrophobic drugs. Certain organic solvents such as dimethylsulfoxide also 
may be employed, although usually at the cost of greater toxicity. Additionally, the 
compounds may be delivered using a sustained-release system, such as semipermeable 
matrices of solid hydrophobic polymers containing the therapeutic agent. Various types of 
sustained-release materials have been established and are well known by those skilled in the 
art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
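
The composition of the VPD:5W co-solvent system described above follows from the stated 1:1 dilution. The Python sketch below works through that arithmetic, assuming that "1:1" means equal volumes and that the volumes are additive, so that each % w/v value is halved; the dictionary and function names are illustrative only.

```python
# Concentrations in % w/v as stated in the text.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W = {"dextrose": 5.0}   # 5% dextrose in water


def mix_equal_volumes(a, b):
    """Return % w/v of each solute after mixing equal volumes of a and b.

    Assumes the volumes are additive, so every concentration is halved.
    """
    solutes = sorted(set(a) | set(b))
    return {s: (a.get(s, 0.0) + b.get(s, 0.0)) / 2.0 for s in solutes}


vpd_5w = mix_equal_volumes(VPD, D5W)
for solute, pct in vpd_5w.items():
    print(f"{solute}: {pct}% w/v")
```

Under these assumptions VPD:5W contains 1.5% w/v benzyl alcohol, 4% w/v polysorbate 80, 32.5% w/v polyethylene glycol 300 and 2.5% w/v dextrose.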

The pharmaceutical compositions also may comprise suitable solid or gel phase 
carriers or excipients. Examples of such carriers or excipients include but are not limited to 
calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, 
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the 
invention may be provided as salts with pharmaceutically compatible counter ions. Such 
pharmaceutically acceptable base addition salts are those salts which retain the biological 
effectiveness and properties of the free acids and which are obtained by reaction with 
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia, 
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate, 
potassium benzoate, triethanolamine and the like. 

The pharmaceutical composition of the invention may be in the form of a complex of 
the protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) 
following presentation of the antigen by MHC proteins. MHC and structurally related 
proteins including those encoded by class I and class II MHC genes on host cells will serve 
to present the peptide antigen(s) to T lymphocytes. The antigen components could also be 
supplied as purified MHC-peptide complexes alone or with co-stimulatory molecules that 
can directly signal T cells. Alternatively antibodies able to bind surface immunoglobulin 
and other molecules on B cells as well as antibodies able to bind the TCR and other 
molecules on T cells can be combined with the pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. 
Suitable lipids for liposomal formulation include, without limitation, monoglycerides, 
diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. 
Preparation of such liposomal formulations is within the level of skill in the art, as disclosed, 
for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of 
which are incorporated herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and 
severity of the condition being treated, and on the nature of prior treatments which the 
patient has undergone. Ultimately, the attending physician will decide the amount of protein 
or other active ingredient of the present invention with which to treat each individual patient. 
Initially, the attending physician will administer low doses of protein or other active 
ingredient of the present invention and observe the patient's response. Larger doses of 
protein or other active ingredient of the present invention may be administered until the 
optimal therapeutic effect is obtained for the patient, and at that point the dosage is not 
increased further. It is contemplated that the various pharmaceutical compositions used to 
practice the method of the present invention should contain about 0.01 µg to about 100 mg 
(preferably about 0.1 µg to about 10 mg, more preferably about 0.1 µg to about 1 mg) of 
protein or other active ingredient of the present invention per kg body weight. For 
compositions of the present invention which are useful for bone, cartilage, tendon or 
ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the 
therapeutic composition for use in this invention is, of course, in a pyrogen-free, 
physiologically acceptable form. Further, the composition may desirably be encapsulated or 
injected in a viscous form for delivery to the site of bone, cartilage or tissue damage. 
Topical administration may be suitable for wound healing and tissue repair. Therapeutically 
useful agents other than a protein or other active ingredient of the invention which may also 
optionally be included in the composition as described above, may alternatively or 
additionally, be administered simultaneously or sequentially with the composition in the 
methods of the invention. Preferably for bone and/or cartilage formation, the composition 
would include a matrix capable of delivering the protein-containing or other active 
ingredient-containing composition to the site of bone and/or cartilage damage, providing a 
structure for the developing bone and cartilage and optimally capable of being resorbed into 
the body. Such matrices may be formed of materials presently in use for other implanted 
medical applications. 
The choice of matrix material is based on biocompatibility, biodegradability, 
mechanical properties, cosmetic appearance and interface properties. The particular 
application of the compositions will define the appropriate formulation. Potential matrices 
for the compositions may be biodegradable and chemically defined calcium sulfate, 
tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. 
Other potential materials are biodegradable and biologically well-defined, such as bone or 
dermal collagen. Further matrices are comprised of pure proteins or extracellular matrix 
components. Other potential matrices are nonbiodegradable and chemically defined, such as 
sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may be comprised 
of combinations of any of the above-mentioned types of material, such as polylactic acid and 
hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be altered in 
composition, such as in calcium-aluminate-phosphate and processing to alter pore size, 
particle size, particle shape, and biodegradability. Presently preferred is a 50:50 (mole 
weight) copolymer of lactic acid and glycolic acid in the form of porous particles having 
diameters ranging from 150 to 800 microns. In some applications, it will be useful to utilize 
a sequestering agent, such as carboxymethyl cellulose or autologous blood clot, to prevent 
the protein compositions from disassociating from the matrix. 

A preferred family of sequestering agents is cellulosic materials such as 
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose, 
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose, 
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being 
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents 
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide, 
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful 
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which 
represents the amount necessary to prevent desorption of the protein from the polymer 
matrix and to provide appropriate handling of the composition, yet not so much that the 
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein the 
opportunity to assist the osteogenic activity of the progenitor cells. In further compositions, 
proteins or other active ingredients of the invention may be combined with other agents 
beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in question. 
These agents include various growth factors such as epidermal growth factor (EGF), platelet 
derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 
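
The weight-percent ranges given above for the sequestering agent are based on total formulation weight. As a minimal sketch of that calculation, the Python fragment below converts a weight percentage into a mass and checks it against the stated ranges; the function names and the 10 g formulation weight are hypothetical.

```python
def sequestering_agent_mass_g(total_formulation_g, wt_percent):
    """Mass of sequestering agent (g) for a given total formulation weight."""
    if not 0.5 <= wt_percent <= 20.0:
        raise ValueError("wt_percent is outside the 0.5-20 wt % range in the text")
    return total_formulation_g * wt_percent / 100.0


def is_preferred(wt_percent):
    """True when wt_percent falls in the preferred 1-10 wt % range."""
    return 1.0 <= wt_percent <= 10.0


# Hypothetical 10 g formulation containing 5 wt % sequestering agent.
mass_g = sequestering_agent_mass_g(10.0, 5.0)
print(mass_g, is_preferred(5.0))
```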

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. 
The dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site 
of damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue 
(e.g., bone), the patient's age, sex, and diet, the severity of any infection, time of 
administration and other clinical factors. The dosage may vary with the type of matrix used 
in the reconstitution and with inclusion of other proteins in the pharmaceutical composition. 
For example, the addition of other known growth factors, such as IGF I (insulin like growth 
factor I), to the final composition, may also effect the dosage. Progress can be monitored by 

1 5 periodic assessment of tissue/bone growth and/or repair, for example, X-rays, 
histomorphometric determinations and tetracycline labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other 
known methods for introduction of nucleic acid into a cell or organism (including, without 
limitation, in the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in 
the presence of proteins of the present invention in order to proliferate or to produce a 
desired effect on or activity in such cells. Treated cells can then be introduced in vivo for 
therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve 
their intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject 
being treated. Determination of the effective amount is well within the capability of those 
skilled in the art, especially in light of the detailed disclosure provided herein. For any 
compound used in the method of the invention, the therapeutically effective dose can be 
estimated initially from appropriate in vitro assays. For example, a dose can be formulated in 
animal models to achieve a circulating concentration range that includes the IC50 as 
determined in cell culture (i.e., the concentration of the test compound which achieves a 
half-maximal inhibition of the protein's biological activity). Such information can be used 
to more accurately determine useful doses in humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in 
cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% 
of the population) and the ED50 (the dose therapeutically effective in 50% of the population). 
The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be 
expressed as the ratio between LD50 and ED50. Compounds which exhibit high therapeutic 
indices are preferred. The data obtained from these cell culture assays and animal studies 
can be used in formulating a range of dosage for use in humans. The dosage of such 
compounds lies preferably within a range of circulating concentrations that include the ED50 
with little or no toxicity. The dosage may vary within this range depending upon the dosage 
form employed and the route of administration utilized. The exact formulation, route of 
administration and dosage can be chosen by the individual physician in view of the patient's 
condition. See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 
1, p. 1. Dosage amount and interval may be adjusted individually to provide plasma levels of 
the active moiety which are sufficient to maintain the desired effects, or minimal effective 
concentration (MEC). The MEC will vary for each compound but can be estimated from in 
vitro data. Dosages necessary to achieve the MEC will depend on individual characteristics 
and route of administration. However, HPLC assays or bioassays can be used to determine 
plasma concentrations. 
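
As an illustration of the therapeutic index defined above (the ratio between LD50 and ED50), the following minimal Python sketch computes the index and ranks two compounds by it. The compound names and dose values are hypothetical and are not taken from this disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50 / ED50 (both in the same dose units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, chosen only for illustration.
candidates = {
    "compound_a": (400.0, 4.0),
    "compound_b": (90.0, 30.0),
}

# Compounds with a higher therapeutic index are listed first.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
print(ranked)
```

A higher index means a wider margin between the toxic and the effective dose, which is why compounds with high indices are preferred.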

Dosage intervals can also be determined using the MEC value. Compounds should be 
administered using a regimen which maintains plasma levels above the MEC for 10-90% of 
the time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
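
The fraction of a dosing interval spent above the MEC can be estimated once a pharmacokinetic model is chosen. The sketch below assumes, purely for illustration, a single bolus dose with first-order (exponential) elimination and no accumulation between doses; the function name, parameters and example values are hypothetical and are not part of this disclosure.

```python
import math


def fraction_of_interval_above_mec(
    c0: float, mec: float, half_life_h: float, interval_h: float
) -> float:
    """Fraction of the dosing interval with plasma level above the MEC.

    Assumes C(t) = c0 * exp(-k * t) with k = ln(2) / half_life_h.
    """
    if min(c0, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all arguments must be positive")
    if c0 <= mec:
        return 0.0  # the level never exceeds the MEC
    k = math.log(2) / half_life_h
    time_above_h = math.log(c0 / mec) / k  # solve c0 * exp(-k * t) = mec
    return min(1.0, time_above_h / interval_h)


# Hypothetical example: peak of 8 units, MEC of 1 unit, 2 h half-life, 12 h interval.
coverage = fraction_of_interval_above_mec(c0=8.0, mec=1.0, half_life_h=2.0, interval_h=12.0)
print(f"{coverage:.0%} of the interval is above the MEC")
```

Under these assumptions the level falls from 8 to 1 in three half-lives (6 h), so the regimen covers about half of a 12 h interval, which sits at the lower edge of the most preferred 50-90% range.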

An exemplary dosage regimen for polypeptides or other compositions of the 
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with 
the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying 
in adults and children. Dosing may be once daily, or equivalent doses may be delivered at 
longer or shorter intervals. 
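
The per-kilogram ranges above convert to absolute daily amounts by multiplying by body weight. The following sketch shows the arithmetic only, assuming the lower bounds are expressed in micrograms per kilogram and the upper bounds in milligrams per kilogram; the function name and the 70 kg example weight are hypothetical.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def daily_dose_range_mg(
    body_weight_kg: float,
    low_ug_per_kg: float = 0.01,
    high_mg_per_kg: float = 100.0,
) -> tuple[float, float]:
    """Return (low, high) daily amounts in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = low_ug_per_kg * body_weight_kg / UG_PER_MG
    high_mg = high_mg_per_kg * body_weight_kg
    return low_mg, high_mg


# Exemplary range (0.01 µg/kg to 100 mg/kg) and preferred range (0.1 µg/kg to 25 mg/kg).
exemplary = daily_dose_range_mg(70.0)
preferred = daily_dose_range_mg(70.0, low_ug_per_kg=0.1, high_mg_per_kg=25.0)
print(exemplary, preferred)
```

The calculation is a unit conversion, not dosing guidance; the amount actually administered depends on the factors listed in the following paragraph.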

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which 
may contain one or more unit dosage forms containing the active ingredient. The pack may, 
for example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser 
device may be accompanied by instructions for administration. Compositions comprising a 
compound of the invention formulated in a compatible pharmaceutical carrier may also be 
prepared, placed in an appropriate container, and labeled for treatment of an indicated 
condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of 
the invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that 
contain an antigen-binding site that specifically binds (immunoreacts with) an antigen. Such 
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, 
Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule 
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ 
from one another by the nature of the heavy chain present in the molecule. Certain classes 
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light 
chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a 
reference to all such classes, subclasses and types of human antibody species. 

An isolated related protein of the invention may be intended to serve as an antigen, or 
a portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for 
polyclonal and monoclonal antibody preparation. The full-length protein can be used or, 
alternatively, the invention provides antigenic peptide fragments of the antigen for use as 
immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the 
amino acid sequence of the full length protein, such as an amino acid sequence shown in 
SEQ ID NO: 685-1368, or 1967-2564, or Tables 3A, 3B, 5, 7, or 8, and encompasses an 
epitope thereof such that an antibody raised against the peptide forms a specific immune 
complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 
amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. 
Preferred epitopes encompassed by the antigenic peptide are regions of the protein that are 
located on its surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a surface region of the protein, e.g., a hydrophilic region. A 
hydrophobicity analysis of the human related protein sequence will indicate which regions of 
a related protein are particularly hydrophilic and, therefore, are likely to encode surface 
residues useful for targeting antibody production. As a means for targeting antibody 
production, hydropathy plots showing regions of hydrophilicity and hydrophobicity may be 
generated by any method well known in the art, including, for example, the Kyte-Doolittle or 
the Hopp-Woods methods, either with or without Fourier transformation. See, e.g., Hopp and 
Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol. 
Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or 
derivatives, fragments, analogs or homologs thereof, are also provided herein. 
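
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over per-residue scores. The Python example below uses the commonly reproduced Kyte-Doolittle scale; the window length of 7, the function names and the toy sequence are illustrative assumptions, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence: str, window: int = 7) -> tuple[int, str, float]:
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Toy sequence (hypothetical): a lysine stretch flanked by isoleucines.
start, peptide, score = most_hydrophilic_window("IIIKKKIII", window=3)
print(start, peptide, round(score, 2))
```

Windows with the most negative mean score are candidate hydrophilic, surface-exposed regions; in practice the window length would be chosen to match the intended peptide length (at least 6 residues per the text above).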

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

The term "specific for" indicates that the variable regions of the antibodies of the 
invention recognize and bind polypeptides of the invention exclusively (i.e., able to 
distinguish the polypeptide of the invention from other similar polypeptides despite sequence 
identity, homology, or similarity found in the family of polypeptides), but may also interact 
with other proteins (for example, S. aureus protein A or other antibodies in ELISA 
techniques) through interactions with sequences outside the variable region of the antibodies, 
and in particular, in the constant region of the molecule. Screening assays to determine 
binding specificity of an antibody of the invention are well known and routinely practiced in 
the art. For a comprehensive discussion of such assays, see Harlow et al. (Eds), Antibodies: 
A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, NY (1988), 
Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of the 
invention are also contemplated, provided that the antibodies are first and foremost specific 
for, as defined above, full-length polypeptides of the invention. As with antibodies that are 
specific for full length polypeptides of the invention, antibodies of the invention that 
recognize fragments are those which can distinguish polypeptides from the same family of 
polypeptides despite inherent sequence identity, homology, or similarity found in the family 
of proteins. 

Antibodies of the invention are useful for, for example, therapeutic purposes (by 
modulating activity of a polypeptide of the invention), diagnostic purposes to detect or 
quantitate a polypeptide of the invention, as well as purification of a polypeptide of the 
invention. Kits comprising an antibody of the invention for any of the purposes described 
herein are also comprehended. In general, a kit of the invention also includes a control 
antigen for which the antibody is immunospecific. The invention further provides a 
hybridoma that produces an antibody according to the invention. Antibodies of the 
invention are useful for detection and/or purification of the polypeptides of the invention. 
Monoclonal antibodies binding to the protein of the invention may be useful 
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal 
antibodies binding to the protein may also be useful therapeutics for both conditions 
associated with the protein and also in the treatment of some forms of cancer where 
abnormal expression of the protein is involved. In the case of cancerous cells or leukemic 
cells, neutralizing monoclonal antibodies against the protein may be useful in detecting and 
preventing the metastatic spread of the cancerous cells, which may be mediated by the 
protein. 

The labeled antibodies of the present invention can be used for in vitro, in vivo, and 
in situ assays to identify cells or tissues in which a fragment of the polypeptide of interest is 
expressed. The antibodies may also be used directly in therapies or other diagnostics. The 
present invention further provides the above-described antibodies immobilized on a solid 
support. Examples of such solid supports include plastics such as polycarbonate, complex 
carbohydrates such as agarose and Sepharose®, and acrylic resins such as polyacrylamide 
and latex beads. Techniques for coupling antibodies to such solid supports are well known 
in the art (Weir, D.M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell 
Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W.D. et al., Meth. 
Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the present 
invention can be used for in vitro, in vivo, and in situ assays as well as for immuno-affinity 
purification of the proteins of the present invention. 

Various procedures known within the art may be used for the production of 
polyclonal or monoclonal antibodies directed against a protein of the invention, or against 
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: 
A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, 
Cold Spring Harbor, NY, incorporated herein by reference). Some of these antibodies are 
discussed below. 



4.13.1 POLYCLONAL ANTIBODIES 

For the production of polyclonal antibodies, various suitable host animals (e.g., 
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with the 
native protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated 
to a second protein known to be immunogenic in the mammal being immunized. Examples 
of such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, 
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can 
further include an adjuvant. Various adjuvants used to increase the immunological response 
include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., 
aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic polyols, 
polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as 
Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory 
agents. Additional examples of adjuvants that can be employed include MPL-TDM adjuvant 
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known 
techniques, such as affinity chromatography using protein A or protein G, which provide 
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific 
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be 
immobilized on a column to purify the immune specific antibody by immunoaffinity 
chromatography. Purification of immunoglobulins is discussed, for example, by D. 
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 
(April 17, 2000), pp. 25-28). 

4.13.2 MONOCLONAL ANTIBODIES 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as 
used herein, refers to a population of antibody molecules that contain only one molecular 
species of antibody molecule consisting of a unique light chain gene product and a unique 
heavy chain gene product. In particular, the complementarity determining regions (CDRs) 
of the monoclonal antibody are identical in all the molecules of the population. MAbs thus 
contain an antigen-binding site capable of immunoreacting with a particular epitope of the 
antigen characterized by a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma method, a 
mouse, hamster, or other appropriate host animal, is typically immunized with an 
immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies 
that will specifically bind to the immunizing agent. Alternatively, the lymphocytes can be 
immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof 
or a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells 
of human origin are desired, or spleen cells or lymph node cells are used if non-human 
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell 
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell 
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103). 
Immortalized cell lines are usually transformed mammalian cells, particularly 
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell 
lines are employed. The hybridoma cells can be cultured in a suitable culture medium that 
preferably contains one or more substances that inhibit the growth or survival of the unfused, 
immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine 
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas 
typically will include hypoxanthine, aminopterin, and thymidine ("HAT medium"), which 
substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high 
level expression of antibody by the selected antibody-producing cells, and are sensitive to a 
medium such as HAT medium. More preferred immortalized cell lines are murine myeloma 
lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center, 
San Diego, California and the American Type Culture Collection, Manassas, Virginia. 
Human myeloma and mouse-human heteromyeloma cell lines also have been described for 
the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); 
Brodeur et al., Monoclonal Antibody Production Techniques and Applications, Marcel 
Dekker, Inc., New York, (1987) pp. 51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed 
for the presence of monoclonal antibodies directed against the antigen. Preferably, the 
binding specificity of monoclonal antibodies produced by the hybridoma cells is determined 
by immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in 
the art. The binding affinity of the monoclonal antibody can, for example, be determined by 
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target 
antigen are isolated. 
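
For orientation, a classic linearized Scatchard plot regresses bound/free against bound, so that the slope is the negative association constant (-Ka) and the x-intercept is the maximum binding (Bmax). The Python sketch below illustrates that textbook linearization with synthetic, noise-free data; it is not the specific procedure of the cited reference, and the function name and values are hypothetical.

```python
def scatchard_fit(bound: list[float], free: list[float]) -> tuple[float, float]:
    """Least-squares fit of B/F = Ka * (Bmax - B); returns (Ka, Bmax)."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope             # slope of the Scatchard line is -Ka
    bmax = intercept / ka   # y-intercept is Ka * Bmax
    return ka, bmax


# Synthetic single-site binding data: Kd = 2 (so Ka = 0.5) and Bmax = 10, arbitrary units.
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]

ka, bmax = scatchard_fit(bound, free)
print(round(ka, 6), round(bmax, 6))
```

With real assay data the points scatter around the line, and a curved plot suggests more than one class of binding site, in which case a nonlinear fit is generally more appropriate than this linearization.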

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods. Suitable culture media for this 
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from 
the culture medium or ascites fluid by conventional immunoglobulin purification procedures 
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel 
electrophoresis, dialysis, or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of 
the invention can be readily isolated and sequenced using conventional procedures (e.g., by 
using oligonucleotide probes that are capable of binding specifically to genes encoding the 
heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as 
a preferred source of such DNA. Once isolated, the DNA can be placed into expression 
vectors, which are then transfected into host cells such as simian COS cells, Chinese hamster 
ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, 
to obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA 
also can be modified, for example, by substituting the coding sequence for human heavy and 
light chain constant domains in place of the homologous murine sequences (U.S. Patent No. 
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the 
immunoglobulin coding sequence all or part of the coding sequence for a 
non-immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be substituted 
for the constant domains of an antibody of the invention, or can be substituted for the 
variable domains of one antigen-combining site of an antibody of the invention to create a 
chimeric bivalent antibody. 

4.13.3 HUMANIZED ANTIBODIES 

The antibodies directed against the protein antigens of the invention can further 
comprise humanized antibodies or human antibodies. These antibodies are suitable for 
administration to humans without engendering an immune response by the human against 
the administered immunoglobulin. Humanized forms of antibodies are chimeric 
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', 
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally comprised 
of the sequence of a human immunoglobulin, and contain minimal sequence derived from a 
non-human immunoglobulin. Humanization can be performed following the method of 
Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et al., Nature, 
332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by substituting 
rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. (See 
also U.S. Patent No. 5,225,539). In some instances, Fv framework residues of the human 
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
can also comprise residues that are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, the humanized antibody will comprise 
substantially all of at least one, and typically two, variable domains, in which all or 
substantially all of the CDR regions correspond to those of a non-human immunoglobulin 
and all or substantially all of the framework regions are those of a human immunoglobulin 
consensus sequence. The humanized antibody optimally also will comprise at least a portion 
of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin 
(Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 2, 593-596 
(1992)). 

4.13.4 HUMAN ANTIBODIES 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from 
human genes. Such antibodies are termed "human antibodies", or "fully human antibodies" 
herein. Human monoclonal antibodies can be prepared by the trioma technique; the human 
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV 
hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human 
monoclonal antibodies may be utilized in the practice of the present invention and may be 
produced by using human hybridomas (see Cote, et al., 1983, Proc Natl Acad Sci USA 80, 
2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et 
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227, 381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in 
humans in all respects, including gene rearrangement, assembly, and antibody repertoire. 
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 
(1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 
(1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature 
Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 
(1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
that are modified so as to produce fully human antibodies rather than the animal's 
endogenous antibodies in response to challenge by an antigen. (See PCT publication 
WO94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains 
in the nonhuman host have been incapacitated, and active loci encoding human heavy and 
light chain immunoglobulins are inserted into the host's genome. The human genes are 
incorporated, for example, using yeast artificial chromosomes containing the requisite 
human DNA segments. An animal which provides all the desired modifications is then 
obtained as progeny by crossbreeding intermediate transgenic animals containing fewer than 
the full complement of the modifications. The preferred embodiment of such a nonhuman 
animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT publications WO 
96/33735 and WO 96/34096. This animal produces B cells that secrete fully human 
immunoglobulins. The antibodies can be obtained directly from the animal after 
immunization with an immunogen of interest, as, for example, a preparation of a polyclonal 
antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, 
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. 
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment genes 
from at least one endogenous heavy chain locus in an embryonic stem cell to prevent 
rearrangement of the locus and to prevent formation of a transcript of a rearranged 
immunoglobulin heavy chain locus, the deletion being effected by a targeting vector 
containing a gene encoding a selectable marker, and producing from the embryonic stem cell 
a transgenic mouse whose somatic and germ cells contain the gene encoding the selectable 
marker. 

A method for producing an antibody of interest, such as a human antibody, is 
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that 
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in 
culture, introducing an expression vector containing a nucleotide sequence encoding a light 
chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The 
hybrid cell expresses an antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically 
relevant epitope on an immunogen, and a correlative method for selecting an antibody that 
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT 
publication WO 99/53049. 

4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES 

According to the invention, techniques can be adapted for the production of 
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent 
No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression 
libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid and effective 
identification of monoclonal Fab fragments with the desired specificity for a protein or 
derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the 
idiotypes to a protein antigen may be produced by techniques known in the art including, but 
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; 
(ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an 
Fab fragment generated by the treatment of the antibody molecule with papain and a reducing 
agent; and (iv) Fv fragments. 

4.13.6 BISPECIFIC ANTIBODIES 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies 
that have binding specificities for at least two different antigens. In the present case, one of 
the binding specificities is for an antigenic protein of the invention. The second binding 
target is any other antigen, and advantageously is a cell-surface protein or receptor or 
receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) 
produce a potential mixture of ten different antibody molecules, of which only one has the 
correct bispecific structure. The purification of the correct molecule is usually accomplished 
by affinity chromatography steps. Similar procedures are disclosed in WO 93/08829, 
published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659. 
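
The count of ten can be reproduced by simple enumeration. The Python sketch below assumes, for illustration, that either light chain can pair with either heavy chain to form a half-molecule and that any two half-molecules can assemble into an antibody, with the antibody treated as an unordered pair; the chain labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H_A", "H_B"]  # heavy chains of specificity A and B
light_chains = ["L_A", "L_B"]  # the corresponding light chains

# Each half-molecule is one heavy chain paired with one light chain: 2 x 2 = 4.
half_molecules = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of half-molecules (the two halves may be identical).
antibodies = list(combinations_with_replacement(half_molecules, 2))


def is_correct_bispecific(antibody) -> bool:
    """True only when each heavy chain carries its own cognate light chain."""
    return set(antibody) == {("H_A", "L_A"), ("H_B", "L_B")}


bispecific = [ab for ab in antibodies if is_correct_bispecific(ab)]
print(len(antibodies), len(bispecific))
```

Four half-molecules give 4 identical pairings plus 6 mixed pairings, for ten species in total, of which exactly one has both arms correctly paired.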

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part 
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant 
region (CH1) containing the site necessary for light-chain binding present in at least one of 
the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the 
immunoglobulin light chain, are inserted into separate expression vectors, and are 
co-transfected into a suitable host organism. For further details of generating bispecific 
antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210 (1986). 

According to another approach described in WO 96/27011, the interface between a
pair of antibody molecules can be engineered to maximize the percentage of heterodimers
that are recovered from recombinant cell culture. The preferred interface comprises at least
a part of the CH3 region of an antibody constant domain. In this method, one or more small
amino acid side chains from the interface of the first antibody molecule are replaced with
larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical or
similar size to the large side chain(s) are created on the interface of the second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g. alanine or
threonine). This provides a mechanism for increasing the yield of the heterodimer over other
unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full-length antibodies or antibody fragments
(e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from
antibody fragments have been described in the literature. For example, bispecific antibodies
can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985) describe a
procedure wherein intact antibodies are proteolytically cleaved to generate F(ab')2
fragments. These fragments are reduced in the presence of the dithiol complexing agent
sodium arsenite to stabilize vicinal dithiols and prevent intermolecular disulfide formation.
The Fab' fragments generated are then converted to thionitrobenzoate (TNB) derivatives.
One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol by reduction with
mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB
derivative to form the bispecific antibody. The bispecific antibodies produced can be used
as agents for the selective immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175, 217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T cells, as well as
trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly
from recombinant cell culture have also been described. For example, bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5), 1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody homodimers.
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90,
6444-6448 (1993) has provided an alternative mechanism for making bispecific antibody
fragments. The fragments comprise a heavy-chain variable domain (VH) connected to a
light-chain variable domain (VL) by a linker which is too short to allow pairing between the
two domains on the same chain. Accordingly, the VH and VL domains of one fragment are
forced to pair with the complementary VL and VH domains of another fragment, thereby
forming two antigen-binding sites. Another strategy for making bispecific antibody
fragments by the use of single-chain Fv (sFv) dimers has also been reported. See, Gruber et
al., J. Immunol. 152, 5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm
of an immunoglobulin molecule can be combined with an arm which binds to a triggering
molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7),
or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16)
so as to focus cellular defense mechanisms to the cell expressing the particular antigen.
Bispecific antibodies can also be used to direct cytotoxic agents to cells which express a
particular antigen. These antibodies possess an antigen-binding arm and an arm which binds
a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or TETA.
Another bispecific antibody of interest binds the protein antigen described herein and further
binds tissue factor (TF).

4.13.7 HETEROCONJUGATE ANTIBODIES

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such
antibodies have, for example, been proposed to target immune system cells to unwanted cells
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using
known methods in synthetic protein chemistry, including those involving crosslinking
agents. For example, immunotoxins can be constructed using a disulfide exchange reaction
or by forming a thioether bond. Examples of suitable reagents for this purpose include
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S.
Patent No. 4,676,980.

4.13.8 EFFECTOR FUNCTION ENGINEERING 

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See Caron et
al., J. Exp Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148, 2918-2922 (1992).
Homodimeric antibodies with enhanced anti-tumor activity can also be prepared using
heterobifunctional cross-linkers as described in Wolff et al. Cancer Research, 53,
2560-2565 (1993). Alternatively, an antibody can be engineered that has dual Fc regions and can
thereby have enhanced complement lysis and ADCC capabilities. See Stevenson et al.,
Anti-Cancer Drug Design, 3, 219-230 (1989).

4.13.9 IMMUNOCONJUGATES 

The invention also pertains to immunoconjugates comprising an antibody conjugated
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive
isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be used
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of radioconjugated
antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes (such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine),
bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates
(such as toluene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled 1-isothiocyanatobenzyl-3-
methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating agent for
conjugation of radionuclide to the antibody. See WO 94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.



4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention
can be recorded on computer readable media. As used herein, "computer readable media"
refers to any medium which can be read and accessed directly by a computer. Such media
include, but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical
storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media. A skilled artisan can readily appreciate how any of the
presently known computer readable media can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the
present invention. As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adopt any of the presently known
methods for recording information on computer readable medium to generate manufactures
comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means
chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present
invention on computer readable medium. The sequence information can be represented in a
word processing text file, formatted in commercially-available software such as WordPerfect
and Microsoft Word, or represented in the form of an ASCII file, stored in a database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any
number of data processor structuring formats (e.g. text file or database) in order to obtain
computer readable medium having recorded thereon the nucleotide sequence information of
the present invention.
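
By way of illustration only, the two storage options named above (an ASCII text file and a database application) might be sketched as follows. The sketch assumes a FASTA-style text layout and uses an in-memory SQLite database as a stand-in for the database applications listed; the record name `example_1`, the table name `sequences` and the short sequence are hypothetical placeholders and are not any of the sequences of the invention.

```python
import sqlite3

def to_fasta(seq_id, sequence, width=60):
    """Format one sequence as FASTA-style ASCII text, wrapped at `width` characters."""
    lines = [">" + seq_id]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"

def parse_fasta(text):
    """Read FASTA-style text back into a {seq_id: sequence} dictionary."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}

def store_in_database(records):
    """Record the sequences in a relational table (in-memory SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences (seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", list(records.items()))
    conn.commit()
    return conn

# Hypothetical placeholder sequence, used only to exercise the functions.
example = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
text = to_fasta("example_1", example, width=10)
conn = store_in_database(parse_fasta(text))
row = conn.execute(
    "SELECT residues FROM sequences WHERE seq_id = ?", ("example_1",)
).fetchone()
print(row[0])
```

Whether a flat file or a database is preferable depends, as the text notes, on the means chosen to access the stored information.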

By providing any of the nucleotide sequences SEQ ID NO: 1-684, or 1369-1966 or a
representative fragment thereof; or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO: 1-684, or 1369-1966 in computer readable form, a
skilled artisan can routinely access the sequence information for a variety of purposes.
Computer software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search
algorithms on a Sybase system is used to identify open reading frames (ORFs) within a
nucleic acid sequence. Such ORFs may be protein-encoding fragments and may be useful in
producing commercially important proteins such as enzymes used in fermentation reactions
and in the production of commercially useful metabolites.
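
As a simplified illustration of what identifying an ORF in a recorded sequence involves, the sketch below scans the three forward reading frames for an ATG codon followed in frame by a stop codon. This is a plain codon scan under stated assumptions, not the BLAST or BLAZE search referred to above; the function name `find_orfs`, the `min_codons` threshold and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for forward-strand ORFs.

    Coordinates are 0-based and half-open; each ORF runs from an ATG through
    the first in-frame stop codon, inclusive. `min_codons` counts the stop codon.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume looking for the next ATG in this frame
    return orfs

# Hypothetical placeholder sequence.
dna = "CCATGAAATTTTAGGG"
for start, end, frame in find_orfs(dna):
    print(frame, start, end, dna[start:end])
```

A fuller treatment would also scan the reverse complement strand and apply a realistic minimum length.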

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the
present invention comprises a central processing unit (CPU), input means, output means, and
data storage means. A skilled artisan can readily appreciate that any one of the currently
available computer-based systems is suitable for use in the present invention. As stated
above, the computer-based systems of the present invention comprise a data storage means
having stored therein a nucleotide sequence of the present invention and the necessary
hardware means and software means for supporting and implementing a search means. As
used herein, "data storage means" refers to memory which can store nucleotide sequence
information of the present invention, or a memory access means which can access
manufactures having recorded thereon the nucleotide sequence information of the present
invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target structural
motif with the sequence information stored within the data storage means. Search means are
used to identify fragments or regions of a known sequence which match a particular target
sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety
of commercially available software for conducting search means is available and can be used in the
computer-based systems of the present invention. Examples of such software include, but
are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA
(NPOLYPEPTIDEIA). A skilled artisan can readily recognize that any one of the available
algorithms or implementing software packages for conducting homology searches can be
adapted for use in the present computer-based systems. As used herein, a "target sequence"
can be any nucleic acid or amino acid sequence of six or more nucleotides or two or more
amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the
less likely a target sequence will be present as a random occurrence in the database. The
most preferred sequence length of a target sequence is from about 10 to 300 amino acids,
more preferably from about 30 to 100 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in
gene expression and protein processing, may be of shorter length.
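
The relationship between target length and chance occurrence can be made concrete with a small sketch. It assumes, as a simplification, that residues are independent and equally frequent, so that a target of length n drawn from an alphabet of size k matches a given position with probability k^(-n). The function names are hypothetical, and the exact-match search shown is far simpler than the homology search programs named above.

```python
def find_all(target, stored_sequence):
    """Return every 0-based position at which `target` occurs exactly (overlaps allowed)."""
    positions = []
    start = stored_sequence.find(target)
    while start != -1:
        positions.append(start)
        start = stored_sequence.find(target, start + 1)
    return positions

def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches under the equal-frequency assumption."""
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)

print(find_all("ACG", "ACGTACGACG"))

# Chance matches fall off sharply as the target sequence grows longer.
for n in (6, 12, 30):
    print(n, expected_random_hits(n, 10**9))
```

Passing `alphabet_size=20` gives the corresponding estimate for amino acid targets.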

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on
a three-dimensional configuration which is formed upon the folding of the target motif.
There are a variety of target motifs known in the art. Protein target motifs include, but are
not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include,
but are not limited to, promoter sequences, hairpin structures and inducible expression
elements (protein binding sequences).

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used
to control gene expression through triple helix formation or antisense DNA or RNA, both of
which methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and
are designed to be complementary to a region of the gene involved in transcription (triple
helix - see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456
(1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences of
the present invention is necessary for the design of an antisense or triple helix
oligonucleotide.
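
As a minimal sketch of how sequence information feeds into antisense design, the following derives a DNA oligonucleotide complementary to a chosen window of an mRNA, restricted to the 20 to 40 base range stated above. The function name `antisense_dna_oligo`, the choice of window and the mRNA string are hypothetical; practical design would also weigh target accessibility, specificity and oligonucleotide chemistry, none of which is modeled here.

```python
# Map each mRNA base to its complementary DNA base.
RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")

def antisense_dna_oligo(mrna, start, length=20):
    """Return the DNA oligonucleotide (5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    region = mrna.upper().replace("T", "U")[start:start + length]
    if len(region) < length:
        raise ValueError("target window runs past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return region.translate(RNA_TO_COMPLEMENTARY_DNA)[::-1]

# Hypothetical placeholder mRNA, not a sequence of the invention.
mrna = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"
print(antisense_dna_oligo(mrna, start=0, length=20))
```

The same reverse-complement step applies when the target is a region of the gene rather than the mRNA, although triple helix design follows different pairing rules that this sketch does not attempt to capture.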

4.16 DIAGNOSTIC ASSAYS AND KITS

The present invention further provides methods to identify the presence or expression
of one of the ORFs of the present invention, or homolog thereof, in a test sample, using a
nucleic acid probe or antibodies of the present invention, optionally conjugated or otherwise
associated with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polynucleotide of the invention is detected in the sample.
Such methods can also comprise contacting a sample under stringent hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the invention under
such conditions, and amplifying annealed polynucleotides, so that if a polynucleotide is
amplified, a polynucleotide of the invention is detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polypeptide for a period sufficient to form the complex, and detecting the complex, so that if
a complex is detected, a polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying
for binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay.
One skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic
acid probes or antibodies of the present invention. Examples of such assays can be found in
Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science
Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3
(1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The
Netherlands (1985). The test samples of the present invention include cells, protein or
membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or
urine. The test sample used in the above-described method will vary based on the assay
format, nature of the detection method and the tissues, cells or extracts used as the sample to
be assayed. Methods for preparing protein extracts or membrane extracts of cells are well
known in the art and can readily be adapted in order to obtain a sample which is
compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the
invention provides a compartment kit to receive, in close confinement, one or more
containers which comprise: (a) a first container comprising one of the probes or antibodies
of the present invention; and (b) one or more other containers comprising one or more of the
following: wash reagents, reagents capable of detecting presence of a bound probe or
antibody.

In detail, a compartment kit includes any kit in which reagents are contained in
separate containers. Such containers include small glass containers, plastic containers or
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from
one compartment to another compartment such that the samples and reagents are not
cross-contaminated, and the agents or solutions of each container can be added in a
quantitative fashion from one compartment to another. Such containers will include a
container which will accept the test sample, a container which contains the antibodies used
in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound
antibody or probe. Types of detection reagents include labeled nucleic acid probes, labeled
secondary antibodies, or in the alternative, if the primary antibody is labeled, the enzymatic,
or antibody binding reagents which are capable of reacting with the labeled antibody. One
skilled in the art will readily recognize that the disclosed probes and antibodies of the present
invention can be readily incorporated into one of the established kit formats which are well
known in the art.

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or
infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve
chemical attachment of a labeling or imaging agent, administration of the labeled
polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging the labeled
polypeptide in vivo at the target site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present
invention further provides methods of obtaining and identifying agents which bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set forth
in SEQ ID NO: 1-684, or 1369-1966, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide
of the invention for a time sufficient to form a polynucleotide/compound complex, and
detecting the complex, so that if a polynucleotide/compound complex is detected, a
compound that binds to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to
a polypeptide of the invention can comprise contacting a compound with a polypeptide of
the invention for a time sufficient to form a polypeptide/compound complex, and detecting
the complex, so that if a polypeptide/compound complex is detected, a compound that binds
to a polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can
also comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression
of a reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound
that binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via
such methods can include compounds which modulate the expression of a polynucleotide of
the invention (that is, increase or decrease expression relative to expression levels observed
in the absence of the compound). Compounds, such as compounds identified via the
methods of the invention, can be tested using standard assays well known to those of skill in
the art for their ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be 
selected and screened at random or rationally selected or designed using protein modeling 
techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents
and the like are selected at random and are assayed for their ability to bind to the protein
encoded by the ORF of the present invention. Alternatively, agents may be rationally
selected or designed. As used herein, an agent is said to be "rationally selected or designed"
when the agent is chosen based on the configuration of the particular protein. For example,
one skilled in the art can readily adapt currently available procedures to generate peptides,
pharmaceutical agents and the like, capable of binding to a specific peptide sequence, in
order to generate rationally designed antipeptide peptides, for example see Hurby et al.,
"Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides, A User's
Guide, W.H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry
28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or
EMFs of the present invention. As described above, such agents can be randomly screened
or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single
ORF or multiple ORFs which rely on the same EMF for expression control. One class of
DNA binding agents are agents which contain base residues which hybridize or form a triple
helix formation by binding to DNA or RNA. Such agents can be based on the classic
phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric
derivatives which have base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix -
see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456 (1988); and
Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense - Okano, J.
Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene
Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally results in
a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences of
the present invention is necessary for the design of an antisense or triple helix
oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention
can be used as a diagnostic agent. Agents which bind to a protein encoded by one of the
ORFs of the present invention can be formulated using known techniques to generate a
pharmaceutical composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic
acid hybridization probes capable of hybridizing with naturally occurring nucleotide
sequences. The hybridization probes of the subject invention may be derived from any of
the nucleotide sequences SEQ ID NO: 1-684, or 1369-1966. Because the corresponding
gene is only expressed in a limited number of tissues, a hybridization probe derived from
any of the nucleotide sequences SEQ ID NO: 1-684, or 1369-1966 can be used as an
indicator of the presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used
in PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both.
The probe will comprise a discrete nucleotide sequence for the detection of identical
sequences or a degenerate pool of possible sequences for identification of closely related
genomic sequences.
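
To illustrate what a "degenerate pool of possible sequences" amounts to, the sketch below expands a probe written with the standard IUPAC nucleotide ambiguity codes into every discrete sequence it represents. The function name `expand_degenerate` and the example probes are hypothetical and are not derived from the sequences of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(probe):
    """Return every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combination) for combination in product(*choices)]

# Hypothetical degenerate probe: Y = C or T, N = any base.
pool = expand_degenerate("AAYGGN")
print(len(pool), pool)
```

The pool size is the product of the number of alternatives at each position, so degeneracy grows quickly with the number of ambiguous positions.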

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such
vectors are known in the art and are commercially available and may be used to synthesize
RNA probes in vitro by means of the addition of the appropriate RNA polymerase such as T7 or
SP6 RNA polymerase and the appropriate radioactively labeled nucleotides. The nucleotide
sequences may be used to construct hybridization probes for mapping their respective
genomic sequences. The nucleotide sequence provided herein may be mapped to a
chromosome or specific regions of a chromosome using well-known genetic and/or
chromosomal mapping techniques. These techniques include in situ hybridization, linkage
analysis against known chromosomal markers, hybridization screening with libraries or
flow-sorted chromosomal preparations specific to known chromosomes, and the like. The
technique of fluorescent in situ hybridization of chromosome spreads has been described,
among other places, in Verma et al. (1988) Human Chromosomes: A Manual of Basic
Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical chromosomal
map and a specific disease (or predisposition to a specific disease) may help delimit the
region of DNA associated with that genetic disease. The nucleotide sequences of the subject
invention may be used to detect differences in gene sequences between normal, carrier or
affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly
practiced using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those
of skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy
is to precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can
be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6),
1469-72); using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989)
Mol. Cell Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al.,
1988; 1989); all references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8),
3072-6, describe the use of biotinylated probes, although these are duplex probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from various sources,
such as, e.g., Operon Technologies (Alameda, CA).

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used.
Nunc Laboratories have developed a method by which DNA can be covalently bound to the
microwell surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with
secondary amino groups (>NH) that serve as bridgeheads for further covalent coupling.
CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be bound
to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing immobilization of
more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end
has been described (Rasmussen et al., 1991). In this technology, a phosphoramidate bond is
employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as
immobilization using only a single covalent bond is preferred. The phosphoramidate bond joins
the DNA to the CovaLink NH secondary amino groups that are positioned at the end of spacer
arms covalently grafted onto the polystyrene surface through a 2 nm long spacer arm. To link
an oligonucleotide to CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus
must have a 5'-end phosphate group. It is, perhaps, even possible for biotin to be covalently
bound to CovaLink and then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M
1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7.
A ss DNA solution is then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated
for 5 hours at 50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash;
first the wells are washed 3 times, then they are soaked with washing solution for 5 min., and
finally they are washed 3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS
heated to 50°C).
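The per-well quantities implied by the linkage protocol above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the volume units garbled in the source are microliters (7.5 ng/µl DNA, 75 µl DNA solution per well, 25 µl of 0.2 M EDC per well); the function name `per_well_summary` is hypothetical.

```python
def per_well_summary(dna_ng_per_ul=7.5, dna_ul=75.0, edc_molar=0.2, edc_ul=25.0):
    """Summarize one CovaLink NH well under the assumed units (microliters)."""
    total_ul = dna_ul + edc_ul
    return {
        "dna_ng": dna_ng_per_ul * dna_ul,                  # DNA mass loaded per well
        "total_ul": total_ul,                              # reaction volume per well
        "edc_final_molar": edc_molar * edc_ul / total_ul,  # EDC after dilution
    }

summary = per_well_summary()
```

Under these assumptions each well receives roughly 560 ng of DNA in a 100 µl reaction, with EDC diluted fourfold. Because both solutions are described as containing 10 mM 1-MeIm7, that concentration is unchanged by mixing.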

It is contemplated that a further suitable method for use with the present invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated
herein by reference. This method of preparing an oligonucleotide bound to a support involves
attaching a nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link
to aliphatic hydroxyl groups carried by the support. The oligonucleotide is then synthesized on
the supported nucleoside and protecting groups removed from the synthetic oligonucleotide
chain under standard conditions that do not cleave the oligonucleotide from the support.
Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described
by Fodor et al. (1991) Science 251(4995), 767-73, incorporated herein by reference. Probes
may also be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic
Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988)
Anal. Biochem. 169(1), 104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.



WO 2004/080148 



PCT/US2003/030720 



107 

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) Proc. Natl. Acad. Sci. USA 91(11),
5022-6, incorporated herein by reference. These authors used current photolithographic
techniques to generate arrays of immobilized oligonucleotide probes (DNA chips). These
methods, in which light is used to direct the synthesis of oligonucleotide probes in high-density,
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside phosphoramidites,
surface linker chemistry and versatile combinatorial synthesis strategies. A matrix of 256
spatially defined oligonucleotide probes may be generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example, Sambrook
et al. (1989) describes three protocols for the isolation of high molecular weight DNA from
mammalian cells (pp. 9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods.
Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA
samples may be prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of
skill in the art including, for example, using restriction enzymes as described at 9.24-9.28 of
Sambrook et al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990)
Nucleic Acids Res. 18(24), 7455-6, incorporated herein by reference. In this method, DNA
samples are passed through a small French pressure cell at a variety of low to intermediate
pressures. A lever device allows controlled application of low to intermediate pressures to the
cell. The results of these studies indicate that low-pressure shearing is a useful alternative to
sonic and enzymatic DNA fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the
two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids
Res. 20(14) 3753-62. These authors described an approach for the rapid fragmentation and
fractionation of DNA into particular sizes that they contemplated to be suitable for shotgun
cloning and sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the
specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from
the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated
the randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ
minus M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts
PyGCPy and PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated
at a rate consistent with random fragmentation.
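The difference between the normal and relaxed recognition specificities described above can be sketched as a simple in silico digest. This is an illustrative model only: it treats the DNA as a linear, single-stranded string (pUC19 itself is circular), cuts between the G and C of every matching site, and ignores reaction kinetics. The names `cut_positions` and `digest` and the test sequence are hypothetical.

```python
import re

# Pu = purine (A or G), Py = pyrimidine (C or T).
# Lookaheads are used so that overlapping sites are all reported.
NORMAL = re.compile(r"(?=[AG]GC[CT])")                            # PuGCPy
RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")     # adds PyGCPy, PuGCPu

def cut_positions(seq, pattern):
    """Return 0-based indices at which the strand is cut (between G and C)."""
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]

def digest(seq, pattern):
    """Return the blunt-ended fragments produced by cutting at every site."""
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

example = "TTAGCTAACGCTTGGCAAT"
normal_fragments = digest(example, NORMAL)
relaxed_fragments = digest(example, RELAXED)
```

In this model the relaxed pattern matches three of the four possible NGCN classes, so sites occur far more often and fragments are correspondingly shorter, which is consistent with the quasi-random fragmentation reported in the text.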

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of
2-5 µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or
agarose gel electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared,
it is important to denature the DNA to give single stranded pieces available for hybridization.
This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is
then cooled quickly to 2°C to prevent renaturation of the DNA fragments before they are
contacted with the chip. Phosphate groups must also be removed from genomic DNA by
methods known in the art.

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the positions of which
correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a
DNA solution to a nylon membrane. By offset printing, a density of dots higher than the density
of the wells is achieved. One to 25 dots may be accommodated in 1 mm², depending on the
type of label used. By avoiding spotting in some preselected number of rows and columns,
separate subsets (subarrays) may be formed. Samples in one subarray may be the same genomic
segment of DNA (or the same gene) from different individuals, or may be different, overlapped
genomic clones. Each of the subarrays may represent replica spotting of the same samples. In
one example, a selected gene segment may be amplified from 64 patients. For each patient, the
amplified gene segment may be in one 96-well plate (all 96 wells containing the same sample).
A plate for each of the 64 patients is prepared. By using a 96-pin device, all samples may be
spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from each patient.
Where the 96 subarrays are identical, the dot span may be 1 mm² and there may be a 1 mm
space between subarrays.
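The 64-patient example above can be sketched as a coordinate calculation. The sketch assumes a 9 mm pin pitch (the standard 96-well plate spacing, which the text does not state) and an 8 x 8 arrangement of the 64 dots within each subarray; with a 1 mm dot span, eight dots plus the 1 mm gap then fill exactly one pin pitch. The function name `dot_position` and the row-major numbering of pins and patients are hypothetical.

```python
PIN_PITCH_MM = 9.0        # assumption: standard 96-well plate spacing
DOT_PITCH_MM = 1.0        # dot span of 1 mm, as in the text
PIN_ROWS, PIN_COLS = 8, 12
SUB_SIDE = 8              # assumption: 64 dots arranged as 8 x 8 per subarray

def dot_position(patient, pin):
    """Return (x_mm, y_mm) of the dot for `patient` (0-63) printed by `pin` (0-95)."""
    if not 0 <= patient < SUB_SIDE * SUB_SIDE:
        raise ValueError("patient index out of range")
    if not 0 <= pin < PIN_ROWS * PIN_COLS:
        raise ValueError("pin index out of range")
    pin_row, pin_col = divmod(pin, PIN_COLS)       # which subarray
    off_row, off_col = divmod(patient, SUB_SIDE)   # offset print within the subarray
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return (x, y)

last_dot = dot_position(63, 95)
```

Under these assumptions the full pattern of 64 x 96 dots spans about 107 mm by 71 mm, which fits on the 8 x 12 cm membrane mentioned in the text.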

Another approach is to use membranes or plates (available from NUNC, Naperville,
Illinois) which may be partitioned by physical spacers, e.g., a plastic grid molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by exposure
to flat phosphor-storage screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of
the present disclosure, one of skill in the art will appreciate that many other embodiments and
variations may be made in the scope of the present invention. Accordingly, it is intended that
the broader aspects of the present invention not be limited to the disclosure of the following
examples. The present invention is not to be limited in scope by the exemplified embodiments
which are intended as illustrations of single aspects of the invention, and compositions and
methods which are functionally equivalent are within the scope of the invention. Indeed,
numerous modifications and variations in the practice of the invention are expected to occur to
those skilled in the art upon consideration of the present preferred embodiments. Consequently,
the only limitations which should be placed upon the scope of the invention are those which
appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated
by reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from
various human tissues and in some cases isolated from a genomic library derived from human
chromosome using standard PCR, SBH sequence signature analysis and Sanger sequencing
techniques. The inserts of the library were amplified with PCR using primers specific for the
vector sequences which flank the inserts. Clones from cDNA libraries were spotted on nylon
membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to obtain signature
sequences. The clones were clustered into groups of similar or identical sequences.
Representative clones were selected for sequencing.
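The idea of clustering clones by their 7-mer signatures can be illustrated with a much-simplified computational analogue. In the laboratory procedure the signature is a pattern of probe hybridization signals; in the sketch below it is approximated by the set of 7-mers present in a known sequence, and clones are grouped greedily by Jaccard similarity. The similarity threshold, the greedy strategy, and all names are assumptions made for illustration, not the method actually used.

```python
def signature(seq, k=7):
    """Set of k-mers present in a sequence (stand-in for a hybridization signature)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster(clones, threshold=0.8, k=7):
    """Greedy clustering: each clone joins the first cluster whose representative is similar enough."""
    clusters = []  # list of (representative signature, member names)
    for name, seq in clones.items():
        sig = signature(seq, k)
        for rep_sig, members in clusters:
            if jaccard(sig, rep_sig) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]

groups = cluster({
    "clone_a": "ACGTACGTAGCTAGCTAGGATC",
    "clone_b": "ACGTACGTAGCTAGCTAGGATC",
    "clone_c": "TTTTTTTTTTGGGGGGGGGGCC",
})
```

The first member of each group plays the role of the representative clone that would be selected for sequencing.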
In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied
Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences.

5.2 EXAMPLE 2

Assemblage of Novel Nucleic Acids

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1369-1966,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to
extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and
UniGene, and exons from public domain genomic sequences predicted by GenScan) that
belong to this assemblage. The algorithm terminated when there were no additional sequences
from the above databases that would extend the assemblage. Further, inclusion of component
sequences into the assemblage was based on a BLASTN hit to the extending assemblage with
BLAST score greater than 300 and percent identity greater than 95%.
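The seed-extension procedure described above can be sketched as a loop that keeps adding qualifying hits until nothing new is found. This is a minimal sketch under stated assumptions: `search` is a hypothetical stand-in for running BLASTN against the listed databases, each member is queried individually rather than the growing assemblage as a whole, and the `Hit` record is invented for illustration. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    seq_id: str
    score: float      # BLAST score
    identity: float   # percent identity

def extend_assemblage(seed_id, search, min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed until no further sequence qualifies.

    `search(seq_id)` is a hypothetical callable returning the hits for one member.
    """
    members = {seed_id}
    frontier = [seed_id]
    while frontier:
        current = frontier.pop()
        for hit in search(current):
            if hit.seq_id in members:
                continue
            if hit.score > min_score and hit.identity > min_identity:
                members.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return members

# Toy stand-in for the databases, used only to exercise the loop.
toy_hits = {
    "seed": [Hit("est1", 450, 98.0), Hit("est2", 500, 90.0)],
    "est1": [Hit("est3", 320, 96.5), Hit("seed", 450, 98.0)],
    "est3": [Hit("est4", 250, 99.0)],
}
assemblage = extend_assemblage("seed", lambda q: toy_hits.get(q, []))
```

The loop terminates because each sequence can be added at most once, mirroring the stated stopping condition that no additional sequence extends the assemblage.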

Table 7 sets forth the novel predicted polypeptides (including proteins), SEQ ID NO:
1967-2564, encoded by the novel polynucleotides (SEQ ID NO: 1369-1966) of the present
invention, and their corresponding translation start and stop nucleotide locations to each of SEQ
ID NO: 1369-1966. Table 7 also indicates the method by which the polypeptide was predicted.
Method A refers to a polypeptide obtained by using a software program called FASTY
(available from http://fasta.bioch.virginia.edu) which selects a polypeptide based on a
comparison of the translated novel polynucleotide to known polynucleotides (W.R. Pearson,
Methods in Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B
refers to a polypeptide obtained by using a software program called GenScan for
human/vertebrate sequences (available from Stanford University, Office of Technology
Licensing) that predicts the polypeptide based on a probabilistic model of gene
structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997),
incorporated herein by reference). Method C refers to a polypeptide obtained by using a Hyseq
proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
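The general idea behind Method C, six-frame translation followed by selection of the longest open reading frame, can be sketched as follows. The Hyseq program itself is proprietary and not described in detail, so this is only an illustration of the technique: it uses the standard genetic code, requires an ORF to begin with methionine, and allows an ORF to run to the end of the sequence without a stop codon. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def longest_orf(dna):
    """Return the longest M-initiated peptide over all six reading frames."""
    dna = dna.upper()
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best

peptide = longest_orf("CCATGGCTTGGAAATAGCC")
```

For the short example sequence, the only substantial reading frame is the third forward frame, ATG GCT TGG AAA TAG.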




5.3 EXAMPLE 3 
Novel Nucleic Acids 

The novel nucleic acids of the present invention were assembled from sequences that
were obtained from a cDNA library by methods described in Example 1 above, and in some
cases sequences obtained from one or more public databases. The nucleic acids were
assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend the
seed EST into an extended assemblage, by pulling additional sequences from different
databases (Hyseq's database containing EST sequences, dbEST, gb pri, and UniGene) that
belong to this assemblage. The algorithm terminated when there were no additional sequences
from the above databases that would extend the assemblage. Inclusion of component sequences
into the assemblage was based on a BLASTN hit to the extending assemblage with BLAST
score greater than 300 and percent identity greater than 95%.

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full-length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any
frame shifts and incorrect stop codons were corrected by hand editing. During editing, the
sequences were checked using FASTY and/or BLAST against GenBank (i.e., dbEST, gb pri,
UniGene, and Genpept) and Geneseq (Derwent). Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington)
and ed-ready, ed-ext and cg-zip-2 (Hyseq, Inc.). The full-length nucleotide and amino acid
sequences, including splice variants resulting from these procedures, are shown in the Sequence
Listing as SEQ ID NO: 1-1368.

The nucleic acid sequences of the present invention were confirmed to have at least
one transmembrane domain using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html). One of skill in the art will
recognize that the proteins of the present invention may be utilized as either a membrane-bound
target or a soluble protein.

Table 1 shows the various tissue sources of SEQ ID NO: 1-684. 

The homologs for polypeptides SEQ ID NO: 685-1368 that correspond to nucleotide
sequences SEQ ID NO: 1-684 were obtained by BLASTP (version 2.0al 19MP-WashU)
searches against Genpept and Geneseq (Derwent) using the BLAST algorithm. The results
showing homologues for SEQ ID NO: 685-1368 are shown in Tables 2A and 2B.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J.
Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/, herein
incorporated by reference), all the polypeptide sequences were examined to determine
whether they had identifiable signature regions. Scoring matrices of the eMatrix software
package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO
databases. Tables 3A and 3B show the accession number of the homologous eMatrix
signature found in the indicated polypeptide sequence, its description, and the results
obtained which include accession number subtype; raw score; p-value; and the position of
signature in amino acid sequence.

Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences
were examined for domains with homology to certain peptide domains. Tables 4A and 4B
show the name of the Pfam model found, the description, the e-value and the Pfam score for
the identified model within the sequence. Further description of the Pfam models can be
found at http://pfam.wustl.edu/.

Table 5 shows the position of the signal peptide in each of the polypeptides and the
maximum score and mean score associated with that signal peptide using the Neural Network
SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical
University of Denmark). The process for identifying prokaryotic and eukaryotic signal
peptides and their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht,
Soren Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and
eukaryotic signal peptides and prediction of their cleavage sites" Protein Engineering, Vol.
10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and a mean
S score, as described in the Nielsen et al. reference, were obtained for the polypeptide
sequences.

Table 6 correlates nucleotide sequences of the invention to a specific chromosomal
location when assignable.

Table 8 shows the number of transmembrane regions, their location(s), and TMPred
score obtained, for each of the SEQ ID NO: 685-1368 that had a TMPred score of 500 or
greater, using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html).
Table 9 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 1-684,
their corresponding polypeptide sequences SEQ ID NO: 685-1368, their corresponding
priority contig nucleotide sequences SEQ ID NO: 1369-1966, their corresponding priority
contig polypeptide sequences SEQ ID NO: 1967-2564, and the US serial number of the
priority application (all of which are herein incorporated in their entirety), in which the
contig sequence was filed.
TABLE 1

Tissue Origin | Library/RNA Source | HYSEQ Library Name | SEQ ID NOS:


adult brain | GIBCO | AB3001 | 39-40 56 68 93 154-155 189 205 215 221 229 245 289-290 296 298 305 307 314 324 346 362 376 384 438 444 493 499 502 532 563 612 624 654 668


adult brain | GIBCO | ABD003 | 10 13 15 17-20 27 29 34 40 47-49 56 61-63 66 68 75 80-82 86 93-94 96 98 102 106 137 150 154 156-159 161 168-169 173-174 179 188 205 210 212 215 221 229-231 243 245 290 296 302 305 307 313-315 319-320 323 325 331 346 349 352 359 362 367 371 376 384 420-421 428 438 444 447 461-462 473-474 487 493 499 516 519 522-523 529 532 541 550 563 587-588 601 612 616 624 627 635 643 652-654 660 669 672-673 677-678


adult brain | Clontech | ABR001 | 7 18 22 24 29 47-50 56 68 70 75 79 112-113 152 161 186 205-206 212 220 230 259-262 280 282 296 302 346 361 376 384 420 465 488-489 492 518 520 587 595 620-621 652 660 682


adult brain | Clontech | ABR006 | 7-8 10 13 16 20-21 23 27 34 37 40 53 56 64-65 69-70 73-74 79 88-89 92 100 104-105 147-150 160-161 170 186 200 207 212 229-230 243 256 259-262 266 275-278 280 282-283 287 289-290 307 309 314-315 317-318 321-322 325 337-338 349-352 357 359-360 364 377 384 430 447-448 461 466 484 499 501 503 518 520 530 532 542-546 552 556 562-563 569-571 600 607-616 620-621 623-625 628-629 641-642 653 660 672-673 677-678 682


adult brain | Clontech | ABR008 | 7-8 10 14 19 21 23 25-28 30-33 37-39 43 46-50 52-53 56-57 59 62-65 67-68 73-76 86-89 92-94 104-105 118 131-134 139-140 144 147-148 150 153-154 160-165 170 180 186 189 205-206 208-212 218-219 223 229-230 232-234 236 242-245 249 259-263 266 268 270 273 283-289 293 298 302 305 307-308 313-316 318-324 334-335 337-341 343 346 349 351 356 359 361-364 367 371 377 381 384 387-388 390 403-404 419 423-425 431 435-436 438 440-441 445-451 462 473-475 484 493 498-501 504-506 509 512 514-522 525 527 529-530 532 534 543-545 550 558 562-564 569 576 583-584 591 597-599 601-602 605 607-610 620-621 624-625 627-628 631-632 638-640 652-653 660 663 665 670-671


adult brain | Clontech | ABR011 | 289 384 537
adult brain | BioChain | ABR012 | 26 384 607
adult brain | BioChain | ABR013 | 20 79 153 220 289 384 465 526


adult brain | Invitrogen | ABR014 | 48-50 52 106 170 230 335 384 430 438 501 530 536 635 643


adult brain | Invitrogen | ABR015 | 20 46 106 150 153 216 371 384 401 461 526 643
adult brain | Invitrogen | ABR016 | 60 69 153 368 384-385 507 522 587 654


adult brain | Invitrogen | ABT004 | 10 16 24 29 43 47-49 56 60 64 67-69 73 79 97-98 165 168-170 179 186 189 205 230 242-247 249 259-263 289-290 296 298 305 308-310 314-315 319 329-330 332-333 349 359 380 384-385 387-388 390 428 451 456-457 475 487-490 492-493 499-500 512 519-520 522 529-530 587 612 620-621 643 654 663 665


cultured preadipocytes | Stratagene | ADP001 | 10 19-20 23 26 36 68 70 106 116-117 147-148 165 171-172 189 220 246-247 256 273 289 305 316-319 329-330 349 351 361 365 392 394-398 400 423-424 428 451 465 487 499 507 522 529 534 543 587 643 672-673 682


adrenal gland | Clontech | ADR002 | 10 18 25 27 29 47-49 52-53 56 64 73-75 83 87 90 100 106 110 124 130 137 144 160-161 163 182 189 198 200 202-203 208 211-212 215 217 220 237-241 249 251 259-263 280 289-293 296 317-319 329-331 344-345 359 362 371 377 384 390 403-404 423-424 426 465 499-501 507 516 522 525 539 570 572-573 585 600-601 611 620-621 623-624 635 643 660 663 672-673 675


adult heart | GIBCO | AHR001 | 5 16 18 24-26 34 37 39 46 56 64 66-68 75 77 83 86-89 92 94-97 101-102 104-106 110 134 150 154 158-159 162 168-170 194-196 202-203 212 215 224-226 229 269 289 296 302 306 308-309 314 320 323-324 331 336-338 342 346 356 367 371 377-378 384-385 390 400 402 417-418 421 428 431 436 438 447 461-462 475 479 484-485 491 498 501 507 516 518 522-525 530 532 534 541 554 564 570 572-573 586-587 601 605 607 610 613-614 635 643 652 662 669 672-673


adult kidney | GIBCO | AKD001 | 5 10 12-13 16 18 20 24-26 29 39 43 52 54 56 62-64 66 68 71-72 75-76 83 89-96 98 106-109 112-114 116-117 122-126 131 137 139 155 158-159 162 170 172-174 177 183-184 188 200 202-203 205 208 215-216 218-219 229-230 245 247 256 268 272 275-278 289-290 296 298-299 302 308-309 314 316 319-320 323 329-330 332-333 336 350 359-360 364 367-368 371 377 384 392-393 400 402 420 423-424 428 431 435-436 438 444 451 461 473-474 484-486 492-493 499-500 504-507 510 516 518-519 521-522 524 526 529-530 532 534 537 539 541 567-568 587-588 613 620-621 623 631-632 635 643 652 654 664 668 672-673




adult kidney | Invitrogen | AKT002 | 6 8 10 14-15 17 20 24-25 29 33-34 40 46-50 64 67 75 80-82 85 88 93-94 106 116-117 126 150 154 157 162-164 168-169 188 199 216-219 222 232-234 255-256 271 275-278 289 296 298 308 312 317-319 332-333 337-338 348 358 360 368 370-371 384 390 400 421 430 435 438 451 461-462 491-493 499-501 507 509 516 518 520 522 524 530 535-537 552 564 567-568 580 587 597-599 607 631-632 635 643 652 662 666 669 672-673 675 677-679




adult lung | GIBCO | ALG001 | 13 22 26 63 66 68 75 93 106 112-114 127-130 137 144 150 165 177 230 256 271 289 302 314 323 327 337 342-343 368 371 384 390 392-393 421 484 488-489 504-507 539 564 638-639 643 661 675




lymph node | Clontech | ALN001 | 13 26 33 54 56 128-131 135 150 166 173-174 202-203 211 215-216 256 259-262 289 320 327 350 367-368 371 465 507 509 526 643 669


young liver 


GIBCO 


ALV001 


5 10 13 24-25 43- 








44 56 67-68 


71 80- 








82 89 106 110-111 








132-133 137 


154








168-170 179 183- 








184 205 218-219 








221 229 275-278 








296 302 320 367 








371 390 428 438 








487-490 498 502 








507 525 530 538 








635 641-643 651 








666 


adult liver 


Invitrogen


ALV002 


5 14 16-17 19 24- 








25 37 52 64 66 68 








80-82 87 90 93 97- 








98 104-105 132-133 








137 140 150 170 








183 186 188 215 








218-220 229 232- 








234 249 256 272 








275-278 289 294- 








295 311-312 314 








319 332-333 351 








358-359 364 366 








371 377 381 386- 








387 392-393 428 








449 451 465 487- 








489 495-498 518 








522 538 593 601 








607 610 631-632 








643 666 


adult liver 


Clontech 


ALV003 


7 18-19 24 38 46 








180 186 216 220 








222 249 275-278 








371 390 427 465 








495 499 530 538 








623 627 632 666 








679-680 


adult ovary 


Invitrogen 


AOV001 


5 7-8 10 12 14 16 








18 20 25-27 29 33 








36 38-40 47-49 53- 








54 56 59 61-62 64 








67-68 73-76 79-83 








87 89 92-94 96 98 








106-107 111-114 








116-118 121 128- 








131 134-135 137 








139-142 150 153- 








154 157-161 171- 








177 179-180 182 








187 189 194-198 








200 202-203 205- 








206 211 218-219 








222 229-230 235- 








241 245 249 251 








254-256 259-264 








267 272 282 289- 








290 296 298-299 








302 305-306 308 








311-314 316 320 








323-325 327 331-








333 336 342 346- 








347 349-351 358 








362 367-368 371 








377 380 383-384 








390 392-393 400 








402 420-423 425 








427-428 435-436 








438 444 451 454 








459-462 471 473- 








474 484 487-489 








491 493 498-499 








501-502 504-507 








511 516 518 521- 








522 524 530 532 








539 543 547-550 








555-556 564-565 








581 587 593 595 








602 605 607 616 








620-621 623-624 








631-632 635 643 








652-654 660 667- 








669 679-680 


adult placenta 


Clontech 


APL001 


1-4 63-64 66 143 






145-146 178 211 








216 289 296 323 








351 384 537 630 


placenta


Invitrogen 


APL002 


1-4 7 51 68 85 98 








151-152 192 208 








215 256 259-262 








305 319 332-333 








384 428 499 533 








602 627 654 666 




GIBCO


ASP001 


7 13-14 17 26 32 








52 54 56 63 75 89 








106 109 112-115 








120 135 137 141- 








142 144 154 157 








173-174 179-180 








186 205 208 216 








220-222 229 252 








256 259-262 272 








279 289 296 298 








302 308 312 319- 








320 337-338 347 








364 367-368 371 








384 400 427 438 








451 459-461 465 








484 487 500 504- 








507 522 525-526 








530 534 555 587 








593 617-618 631- 








633 635 638-639 








643 663 669 675- 








676 679 


adult testis 


GIBCO 


ATS001 


5 10 19 29 39 64 








68 93 100 106 116- 








117 137 145-146 








150 153 172 175-








176 181-182 198 








202-203 229 249 








256 267 289 296 








298 302 305 307-








308 314 316 323 








331 356 359 362 








364 371 384 402 








426 438 451 485 








500 507 518-519 








591 597-599 619- 








621 643 654 662 


adult bladder 


Invitrogen 


BLD001 


5 10 26 51 65 68 








84 89 93 131 175- 








176 211 256 259- 








262 267 289 314 








317-318 332-333








351 383-384 395- 








398 423-424 426 








499 501 522 525 








580 593 643 661 








682 


bone marrow 


Clontech 


BMD001 


5 7 30-31 34 37 40 








47-49 54-56 62 68








75-80 83 93 96 100 








131 136 147-148 








150 158-159 163 








165 172 177 198 








204 206 211 216 








229 289 302 308 








316 319-320 324- 








325 337-338 350 








358 364 367-368 








371 400 422 428 








438 452 454 461 








478 484 487 491 








499-502 507 509- 








510 520 530 536- 








537 541 543 554 








587 624 638-639 








643 651-652 654 








667-669 672-673 


bone marrow 


GF 


BMD002 


7-8 12 14 17 20 25 








27-28 32-33 37 43 








52 57 63-64 66-68 








77 87 100 102 106- 








107 112-114 116-








118 120 131 136-








137 144 147-148 








150-153 157-159 








163 172 179 199 








206 215-216 222 








256 259-263 268 








272 275-278 286 








289 298 302-303 








305 308 317-318 








325 337-338 341 








343 347-348 368 








371 390 400 427-










AOfl 47n-471 A7A 
*±ZO i ±0U-401 








4"3C 4*17 AAA Atzi 
•sOO *iO / ftftft 4bl 








AC1 — 4C9 /too y* Q O 
4ol-*fcOZ *iOO-4oy 








** y 1 - .4 y Z 4 y y b U 1 - 
















511 516 520 525-








526 530 537 547 








554 558 560-561 








585 587 595 600 








610 623 629 631- 








633 635 638-640 








643 667-669 672- 








673 679 


bone marrow


Clontech


BMD004


507 522 






I3L*1X' Uu / 


-ifTQ c,04- c ;0£ fi79 

J DO JUI'JUD O/Z 


"rJlXLUlc OE XO 


various vendors


ppHm n 

I.UUUJ.U 


QQ i *jo ice: 
yy ljz-ijj loo 


tissues - mRNA






077-941 97E.-97R 
4j / Z^i Z / 3 — Z / O 








9Q0 9QP. "infi ITC 

ZJ7U Z~70 OUD JJO 








76ft IfiH 409 497- 








494 cnQ "^CIC epe 

*i Z t 0\J^7 JJO jOD 








610 




various vendors


CGd011


77 49 1 CI 1 CQ 


tissues - mRNA






1 7p 9 1 7 _ 9 1 A OAK 
I/O Z*±D 








947 4C7 C.9C c.77 
Z*4/ "io / OZ O JJ / 








c ,79- c ,77 67 5 


^HlXCUTc Ot lo 


various vendors


CGd012


C 1/1 1Q O 1 *") /I n 

D 14 la zl z4 il 


tissues - mRNA






7*3 7C 7Q A9 4/ AC 
JO J J *iz ♦5*2 -Tio 








CI C *3 CQ CI C*5 *7A 

o± bj bo ox-oz /U- 








79 nc an qa qc on 
/z / D oU oft-oD yu 








Q9 Q7 QC QQ i nn 

yz-yj yo yo iuu- 








i f|l 197 171 1 AA 
1UJ XZ / 1J1 X*t*±- 








i AC 1t^7-1C4 1 tr»7 
X40 ISO-ID'S ib/ 








i en i ci i C7 i cr; 
lou-ioi loo lob 








ICQ — ICQ 17C.-17C 

loo- xoy x/b-i/o 








17Q-17Q 1 Q7 1 Qr; 
i/o-i/y lOO lob 








i on i Q7 onn 91 Q 
loy iyo zuu zio- 








01 Q 791 99Q 979_ 

ziy zzi zzy zoz- 








97A 94C 9/1*7 9 CC 
ZJfi Z*ib ^f.^ / ZOO 








OCQ 9C9 97"^-97Q 

zby— zoz z/b— z/o 








9Qn 9RQ-9Q9 9QR 

zou zoy-zyz zj?o 








7nn 7ni 7no 7i i 
oUU-oUl ouo oil 








71*7 71Q 77C 77C_. 

ol / -olo ozb oob- 








77Q 7/4 9 *JAA 74*7 

o oo oftz oftfl-oft/ 








TlVlQ lO *icc *3CC 

o4y obz obb-obo 








7CQ 7Cn 7CQ 7*7n 

oby-ooU ooo 0/0- 








o /b ooU oo4-oob 








TOO *301 OO/l *JOQ 

odd jyi oy4-jyy 








4U1-4UZ 4Ub-40/ 








410 419-417 41 Q 

? J.U *xX^ 1XO *±X^7 








428 450-451 464 








467-469 471 504- 








507 512 516 518 








524 526 532 537 








541 545 547-549 








554 556 563-564 








572-573 586 590- 








591 600 602 605 








623-625 627-628








652 654 659-660 








664 667 670-671 








676 682 


"rllAture Ot 1Q 


various vendors


CGd013 



56 58 61-62 70 131 


tissues - mRNA






160-161 163-164 








193 247 290 311








345 348 360 368








370 394-398 512 








537 556 660 682 




various vendors


CGd015


1-5 8 14 17 52 59 


tissues - mRNA






68 87 215 228 259- 








262 272 275-278 








289 309 371 377








392-393 400 402 








420 446-447 451 








492 498 504-506 








514 521 537-538 








588 620-621 637 








643 654 672-675 


*Mixture of 16


various vendors


CGd016


10 14 19 24-28 33


tissues - mRNA 






57 65 70 76 112- 








114 121 131 151- 








153 163 183 206 








218-219 325 328 








332-333 394-398 








435 440-441 488- 








489 500 510 518- 








520 532 569 590 








641-643 653 662- 








663 668 671-673 








682 


adult colon 


Invitrogen


CLN001


5 10 14 99 *lt: 47 - 








50 56 112-114 135








175-176 179 220 








230 254 256 289- 








290 308 332-333 








343 368 371 385- 








386 415 427-428 








436 465 498 510








518 534 572-573 








580 597-599 607 








643 651 661 663 








669 


adult cervix




CVX001


7 10 14 16 18 20 









23-26 30-31 40 47-








49 56 62 66 70 73-








76 83 85 87 89 93-








94 97 103 106 126








131 137 141-142 








144 147-148 154 








175 177 179 182 








188-189 197-198 








202-203 206 211 








221 229 245 249 








259-263 267 282 








287 289 296 298 








302 305 308 314 








320 323-325 329-








333 350 356 358 








362 367-368 371 








377 382 384 390 








400 438 451 454 








459-460 462 465 








484 487-490 492- 








493 499-502 507 








516 522 524-525 








530 532 534-535 








541 550 555 572- 








573 580 587 602 








605 610 613-614 








616 623-624 626 








628 643 652 661 








663-664 668 680 








682 


diaphragm 


BioChain 


DIA002 


93 134 308 402 


endothelial 


Stratagene 


EDT001 


7 10 12 17 19 23 


cells 






29 34 36 39 52 54 








56 63-64 66 68 75 








80-84 86-89 92-93 








95-97 106-107 116- 








117 127 131 137 








139 147-148 150 








154 157-159 168- 








169 172 179 182 








192 198-199 202- 








203 208 211 215 








217 220-221 230- 








234 249 254 256 








259-262 264 270 








272 289-290 296 








298 313-314 316 








320 323-324 348- 








350 364 367 371 








376-377 390 392 








430 435 438 445- 








446 465 473-475 








484 487-489 492 








498-499 502 504- 








507 510 518 522 








524 532 541 543 








552 554-555 587- 








588 595 602 610 








631-632 643 651- 








654 662 668-669 








672-673


fetal brain 


Clontech 


FBR001 


8 24 54 56 59 69 








88 229 384 428 








440-441 541 628 








671 


fetal brain 


Clontech 


FBR004 


20 53 160-161 170 








293 385 461 530 








605 620-621 654 








660 


fetal brain 


Clontech 


FBR006 


7-8 10 15 18-19 








24-26 29 33 46 53 








56 59 62-64 66 68








70 73 79 84 87 131 








140 147-148 155








163 165 170 179- 








180 189-190 208 








211 218-219 229- 








230 232-234 236 








245 249 259-262 








267 284-287 293 








298 305 308 313- 








314 316-319 322 








324 337-338 343- 








346 350-351 354








359-362 376 380- 








381 384 387-398 








403-404 423-424 








428 431 435 438 








440-441 445-447 








451 462 473-475 








484 492 498-501 








504-507 509 512 








516 518-519 521- 








522 529-530 532 








541 543 550 554 








558 566 568-570 








576 591 597-599 








603 605 607-609 








623-625 627-632 








640 643 652-653 








662-663 665 667 








671-673 675 682 


fetal brain 


Clontech 


FBRS03 


17 371 


fetal brain 


Invitrogen 


FBT002 


7 10 29 43 47-49 








52 60 64-65 67-68 








79 83 86 92 94 131 








139-140 168-169 








180 202-203 205








218-219 230 242- 








243 259-262 289 








296 298 302 305 








307 319 329-330 








332-333 364 380 








390 392-393 451 








473-474 484 492 








499-500 518 520 








537 553 607 619 








643 654 


fetal heart 


Invitrogen 


FHR001 


8 14-15 20 24-26 








34 37 39 46 53 56- 








57 60 63 70 75 80- 








82 96-98 101 106 








120 127 131 134 








153 161 168-169 








171 180 202-203 








216 229 236 266- 








267 289-290 303 








305 308 314 316 








325 344-345 356 








358-359 363 366








371 384 392-393 








395-398 400 402 








419 422 431 434 








436 438 451 453 








461-462 478 484 








500 504-506 518 








522 525-526 530 








535 537 539 541 








550 570-573 586- 








588 590-591 597 








601 605 610 613- 








614 626 630-632 








640 643 652 669 








672-673 675 682 


fetal kidney 


Clontech 


FKD001 


26 62 96 106 115 








150 153 217-219 








259-262 289 308 








323-324 350 371 








428 435 507 522 








537 643 


fetal kidney


Clontech 


FKD002 


46 54 64 68 85 








107-108 126 131 








155 158-159 163- 








164 167-169 188 








224-226 229 232- 








234 236 245 282 








284-285 289-290 








293 298 340-341 








343 350 370 417- 








418 431 436 438 








461 484 499-500 








516 518 532 567- 








568 572-574 589 








596-599 613 624- 








626 628 640 671- 








673 


fetal kidney 


Invitrogen 


FKD007 


227 


fetal lung 


Clontech 


FLG001 


25 40 56 75 93 106 








112-114 131 229 








316 428 436 484 








499 572-573 623 


fetal lung 


Invitrogen 


FLG003 


5 7 10 16 22 25-26 








44 47-50 57 75 79 








102 106 148 157 








175-176 189 191 








256 259-262 314 








356 359 371 384 








399-400 423-424 








428 430 451 488- 








490 500 504-507 








518 529-530 534 








539 550 556 620- 








621 


fetal lung 


Clontech 


FLG004 


305 


fetal liver- 


Columbia 


FLS001 


1-5 7-8 10 12 14- 


spleen 


University 




17 19-20 24-27 29- 








54 56-57 62-64 68 








71 75 80-83 85 87-










Q7 


99-100 104-107 








XU 27 


-110 


131 


137 










-142 


150- 


-153 








J.jj 


168- 


169


177- 








ion 

lOU 


183- 


184 


188 








J.270 


200 


202- 


-203 








one: 


208 


212 


215- 








o o n 


222 


229 


245 








ZOl 


-252 


256- 


-262 








OCA 


267 


271- 


-273 








A / b 


-279 


289- 


-290 










298 


302 


306 








t no 


314 


316- 


-318 








U 


324- 


325 


331- 










337- 


338 


349- 








*3 C O 


359 


364 


366- 








JOO 


371 


377 


383 








Job 


-387 


390 


392- 










400- 


401 


403- 








404 


420- 


421 


423- 








/in/i 

4.<s4 


428 


434 


-435 








4 Jo 


440- 


441 


445- 








a a a 
44b 


451 


455 


-457 








4oy 


-462 


475 


479- 








4 D 1 


484 


487 


491- 








4y^s 


498- 


507 


510- 








ci i 


516 


518 


521- 










526 


530 


533 








D J O 


-538 


541 


543 








550 


554- 


556 


558 








jdo 


593 


595 


-598 








OU1 


-602 


605 


607 








cn n 


613 


620 


-621 










-624 


629 


634 










-643 


651 


-652 








CC7 


-668 


671 


-673 








C*7C 


681 







fetal liver- 



Columbia 


FLS002




7-8 


10 12 14- 


spleen 


University




1 / 


19 24 


26 


-27 34 








*a c 
o b 


38 40-42 


44 47- 








4 Q 
4 y 


52-54 


56 


-57 62 








o4 


66 68 71 


75-76 








O U - 


83 85-86 


88-89 








y 1- 


93 96 98 


-100 








10b 


-108 


110 


112- 








11 J 


115- 


117 


128- 








1 31 


135 


137 


139- 








1 A O 

14Z 


150 


153 


157-








J. 2? 27 


163 


171 


-174 








179 


183- 


-184 


186 








188 


-189 


192 


198 








200 


202- 


203


206 








208 


212 


216 


218- 








220 


229- 


230


236- 








241 


245 


249 


252 








256 


-262 


275 


-279 








290 


294- 


295


298 








302 


305- 


306


308








117-114 11fi-17n 

jl* OX 1 * jJLD-jZU 








174-17R 797 it; 

O^** OZO OZ/ OOO 








J J / JJO o4Z o4o- 








ic.n ic7_ifin o cc 
OoU jd / -jdu joo- 








ifiQ lofi ooo o q *7 
jdo o/o-o/o JO / 








oqn Ann 41 o Aon 

0 OU 4 UU 4li7-^^U 








^Tf A7Q AOA A1C 

4Zo 42o 4o4 4 Jo 1 








d "3 Q A/IO /l/ll AAA 
IJO 44U~44X 444 








/ cr ACO /ICQ /ICO 

4dd-4d/ 45o-4o2 








/no ii 01 /1 0 0 a a £ 
4 /o 4ol-4o2 486- 








AQO ^ QO /l Q Q CAT 

40*2 4yo-4yy 501 








cn^^ cnc cno cno 
5U4-5Uo jUo-ouy 








C 1 C CIO COT coo 

51o 51b 521-522 








527 530 534 536- 








CT7 C/IO CCA Tl C C 

53 / 543 554-555 








SfiA ^Rl ^R7-C?fin 
JOI Jul DO/ — OOO 








OOO DJ7 / "J JO DUX 








firm ci n fii 1 con- 

DUj OXU OXO OZU 








fi71 fi71_fi7t; fi77 








fi7Q fil1-fiO.O COA 








fi41-fi41 fim -fic;9 
OfrX"OfeO DDl-OOz 








fififi-fififl fi71- 

O O Z. OOO OOO O (1 








coo fi7C (Tp 1 } 

O / O O / D DOj 


fetal liver- 


Columbia


FLS003


0 C T A 1 Q on O/l O C 
2-5 14 Xo 2U 24 2o 




University




44 co C/i cq on 0 0 
44 bz o4 00 oU-oj 








00 0 a qq inn 1 AC 

00 3 i oo-xuu lUo 








1*37 ICO 1C"7 l CO 
XO / lOJ 15 / loo 








1 Q O TOO OOO OOO 

loo iy / zZ2 Z2o i 








OOC O 4 c occ 00c 
2oo 245 25b 2/5- 








ooq 00.0 oqq one 
2/0 2oo 2oo JUb 








ll^-llfl 111 *3T7. 
0 J. .j 0 J. 0 jji 0 O / — 








*J*3Q OAfi l^fi ICQ 
OOO O *± O OOU j jj 








OOO O /X t15~uZU 








47R 4*36 41fl 4Q1- 

t<bO 1 JO IOO *± O X 








4 Q7 t;n9 RD7 m R 

*±;?Z JU/ JlO 








*"»7i -^99 t:*in sir 

•Jil D Z JJW OOO 








OOO 000 070 








fi71-fi74 fico CCO 
y^J OZ? OjZ OO/ 








672-67*5 679 


fetal liver


Invitrogen


FLV001


^ 1f> 94 AC CO CA 
O iU Z*i *tO jZ 04 








fi7-CR 1 C7 ICQ ICO 
O / DO ID/ lOO-lOO 








1 qa ono ono 0 1 1 
loU 2U2-2UO 211 








71 fi 71R-71Q 777 








717-741 7S.fi 7RQ- 
Aj / "Z"tl ZOO zoo 








7fi7 777 77fi-77R 
Z OZ Z / Z Z / 0 — Z / 0 








117-llfl 171 17A 
Jl /"JlO OZX OZ4 








117-111 1A7 1A7 
OIZ Oft/ 








•JC1 OO-l AA1 A71 
ODX o/X 4UX 4Z1 








428 434 451 488- 








490 498 593 623 








643 679 


fetal liver 


Clontech 


FLV002 


10 24 140 153 170 








230 249 256 275- 








278 284-285 325 








358 366 392-393 








500 518 538 576- 








577 613 623 641- 








642 666


fetal liver 


Clontech 



FLV004


5 13-14 18 20 24








35 46-50 56 63-64 








68 75 100 102 106








108 116-118 137 








140 144 147-148 








170-172 218-219 








236 256 259-262








975-978 718 797 








795 799-770 740- 








341 356-357 371 








390 428 431 436 








438 440-441 453 








461-462 498-499 








518 530 537-538 








543 587-588 623 








629 632 638-639 








643 651-652 662 








666 671-673 


"F ca t~ 13 1 mi 1 gpI P 


Invitrogen




5 16 94-96 64 97 

X) JLO 0*± 7 J 








139 1 44 168-1 69 








171 175-176 181 

J. / -1- 1 -J JL. 4 VJ -J- w ^ 








209-907 919 918- 








219 256 289-290 








296 298 317-318 








349 756 764 371 

— > " ^ XJ _J VJ J U i -J / _L 








377 380 392-393 








402-404 427 444 








518 523 564 586 








623 661-662 


fetal muscle


Invitrogen


FMS002


6 m-1fi 91 96 99 

O xj lO Z -L iiO Xi X7 








77 41 59 57 7*5 87 

«J / *± i XJ o XJ / / XJ O / 








96 1 01 -1 09 1 06 

X? D 1U1 1UZ 1UO 








116-118 171 1 5R - 

llu HO IJl XJO 








1 59 1 67-1 69 1 71 
J. xj id i xo j j. / j. 








1 80 1 89 956-958 

XOU J- O -7 X3 O — Xi J O 








979 989-990 997 








998 706 708 716 

£, J O J VJ VJ X>wO J XU 








325 332-333 343 

-J 41i _» J J J J -J ™ ~J 








7S1 357 356 380 








382 388 400 402 








411 416 419 428- 








429 431 453 499 








516 522 525 530 








532 541 543 550 








563 565-568 572- 








573 584 586 603 








613 623 643 669- 








663 


fetal skin 


Invitrogen 


FSK001 


5 7-8 10 14-17 20 








23 25-26 29 36-37 








39 41 46 51 53 68- 








70 80-82 84 86 90 








92-93 96 111 127- 








130 132-133 141- 








142 147-148 151- 








152 158-161 163- 








165 173-174 202- 








203 205-207 218-



220 224-226 229-
230 245 254 256-
262 289-290 296
298 302 305 308-
309 315-316 319
324-325 327 358-
359 364 369 371
388 392-393 400
405-413 417-418
436 438 440-441
451 458-465 467-
472 476 487 492
499 518-520 525
530-532 547-549
558 564 571 580
583 591 607 610
617-618 620-621
623 627 643 652
659-663 671-673
680 682



fetal skin 



Invitrogen 



FSK002 



5 10 16 18 20 23-
26 36-37 39 41 46
52-53 56 61-65 68
70 80-83 87 94 96
100 130-131 148
158-159 162-164
168-169 182 188
193 201 220 224-
226 229 235-241
245 249 254 257-
262 289-290 293
298 302 316 318
325 331-333 335
340-341 350 359
361 363-364 371
390 392-398 400
403-404 408-409
411 417-418 422
428 431 436 440-
441 451 453-462
464-465 467 471
476 478 484 499
502 504-506 512
516 518 521-522
530 532 541 543
547-549 556 564-
565 568 587 589-
591 593-594 597-
598 613-614 616
624-625 629 631-
632 637 640 643
652 662 667 669
671-673 681-682



fetal spleen 



BioChain 



FSP001 



26 87 371 461 667 



umbilical cord 



BioChain 



FUC001 



5 18 20 26 40 47- 
49 70 72 83 86-87 
93 96 106 110-111 
116-117 124 126- 



WO 2004/080148 



PCT/US2003/030720 



131 












XZ 1 


134 


144 


±DZ - 










1 CO. 


155 


157 


1 C Q 










JLbl 


165 


171 












one 
zUb 


218- 


219 


ZZQ - 










o o e 


229 


243 


o^o 
<s4 / 












256 


259 


o fo 
- ZbZ 










289 


296 


298 


303 










*5 A C 

305 


-306 


308 


314 










316 


325 


332 


-333 










337 


-338 


344 


-345 










349 


352 


359 


364 










J I L 


394- 


398 












417 


-421 


427 


431 










436 


438 


453 


473- 










474 


477 


479 


499- 










500 


507 


512 


522 










525 


535 


537 


565 










593 


595 


613 


620- 










621 


623- 


624 


637 










643 


653- 


654 


660- 










661 


668-669 


682 


fetal brain 


GIBCO 


HFB001 




5 10 18- 


21 27 34 










38- 


40 47 


-49 


52 56- 










60 


62 64 


66 


-70 72- 










76 


80 83


86 


92-93 










134 


139 


141 


-142 










149 


-150 


155 


170 










172 


179- 


180 


185- 










186 


188 


202 


-203 










205 


207 


209 


-212 










216 


229- 


230 


256 










286 


-287 


289 


294- 










296 


298 


314 


319- 










320 


323 


325 


337- 










338 


346 


350 


357 










367 


371 


376 


381 










384 


420 


436 


438 










444 


447 


454 


459- 










462 


475 


484 


487 










492 


-493 


499 


-500 










507 


518- 


519 


522 










529 


-530 


532 


534 










541 


543 


563 


570- 










571 


580 


597 


-598 










601 


607 


616 


619- 










621 


623- 


624 


643 










653 


-654 


662 


664 










668 


671- 


673


675 










677 


-678 


682 




macrophage 


Invitrogen 


HMP001 




18 


26 43 64 


118 










144 


179 


211 


245 










329 


-330 


347 


371 










427 


435 


461 


502 










530


537 


620 


-621 










635 


638- 


639 




infant brain 


Columbia 


IB2002 




7 14 16- 


17


21 23 




University 






25- 


26 29 40 


47-50 










56- 


57 59-60 


64 67-








68 70 73-74 79 83 








88 91-92 94 98 103 








115 127 137 139 








150-152 156 158- 








159 161-163 173- 








174 182 186 188- 








189 197 202-203 








205-215 230 245 








259-262 264 268 








280 285 289 296 








298 305 307-308 








313-316 319 322- 








324 326 334 346- 








347 349-351 359 








363-364 367 371 








376-377 390 420 








431 436 438 444 








447-449 451 453 








461-462 479 487 








492 498-501 504- 








506 516 519 522 








529-530 537 541 








543 545 556 564 








572-573 588 592- 








593 597-598 600 








604-605 607 610 








619 622 624 627- 








628 643 652-654 








660 663 674-675 








682 


infant brain 


Columbia 


IB2003 


7 10 16 19-20 25 




University 




29 35 43 46-50 56- 








57 59-60 64 68 70 








79-82 87 92 106 








139 150 158-159 








162-163 165 173- 








174 181 186 189 








202-203 205 210- 








214 229-230 245 








256 259-263 289- 








290 298 305 307- 








308 314-315 319 








322 328 334 337- 








338 347 349 351 








359 364 371 380 








385 428 436 438 








444 447 449 451 








462 475 484 487 








492-493 498-502 








519 522 529-530 








532 537 540 550 








556 593 602-605 








607 616 622 627 








631-632 643 652- 








654 663 672-673 








682 


infant brain 


Columbia 


IBM002 


47-50 84 151-152 




University 




157 188-189 209










289 


390 


423- 


424 








453 


628 






infant brain 


Columbia 


IBS001 


10 


16 29 


46- 


50 56 




University 




58 


67 78 


80- 


82 156 








163 


186 


259- 


262 








285 


305 


315 


334 








349 


452 


488- 


489 








522 


532 


540 




lung, 


Stratagene 


LFB001 


5 7 


16 19 4C 


54 56 


fibroblast 






61- 


52 68 


83 


93 106 








116 


121 


137 


172 








191 


198 


205 


223 








256 


289 


325 


329 








349 


371 


400 


438 








484 


501- 


502 


507 








518 


522 


525 


532 








541 


610 


631- 


632 








643 


651 


669 




lung tumor 


Invitrogen 


LGT002 


5-7 


10 15-16 18-19 








26 


29 34 


-36 


38 40- 








41 


46-50 


52 


56 59 








64 


68 75 


86 


89 91- 








96 


103-106 112-114 








116 


-117 


120 


128- 








130 


135 


141- 










144 


147- 


148 


150 








154 


-155 


157- 


-159 








162 


-164 


172- 


174 








179 


-180 


190- 


192 








198 


202- 


203 


208 








215 


220- 


221 


223 








229 


236 


249 


255- 








258 


263 


271 


275- 








278 


284- 


285 


291- 








292 


296 


302 


309 








314 


316 


319 


323 








327 


331 


342 


349- 








351 


353 


358 


364 








368 


-369 


371 


390 








392 


-393 


399- 


400








420 


-421 


427 


431 








436 


438 


444 


453- 








454 


459- 


462 


465 








470 


484 


486 


488- 








492 


499- 


500 


502 








507 


511 


518 


522 








525 


-526 


530 


537 








539 


543 


550 


580 








597 


-599 


605 


623- 








625 


627 


637 


643 








652 


661- 


662 


665- 








666 








lymphocytes 


ATCC 


LPC001 


13 


16 18 20 


27 43 








47- 


49 54 62- 


64 66-








68 


80 87 90 


96 98 








115 


118 


120 


131 








144 


163 


202- 


203








211 


252 


256 


259-










262 


265 


290 


296 








308 


324- 


325 


347 








350 


358 


371 


377 








384 


400 


420 


428 








436 


462 


467 


470 








483 


-487 


499 


-502 








504 


-507 


509 


518 








522 


525 


530 


543 








545 


550 


588 


600 








605 


607 


624 


-625








633 


635 


643 


645 








654 


669 


672 


-673 








675 








leukocyte 


GIBCO 


LUC001 


10 


16 18 


24 


34 38- 








40 


43-44 


47 


-50 52 








54- 


57 62 


-64 


66 68 








78 


80-82 


86 


-89 93- 








94 


98 106 109 111-








120 


131


134 


137 








139 


144 


150 


-152 








154 


163 


165 


177 








179 


186 


189 


198 








202 


-203 


208 


211 








218 


-219 


221 


229 








236 


247 


249 


252 








256 


259- 


264 


270 








275 


-278 


289 


-290 








298 


302 


305 


308 








315 


317- 


318 


323 








325 


328 


337 


-338 








342 


347 


350 


358 








364 


368 


371 


390 








392 


-393 


421 


427- 








428 


430 


433 


-435 








437 


-438 


440 


-441 








444 


451- 


452 


454 








461 


475 


484 


-487 








491 


493 


498 


-500 








502 


504- 


507 


509 








518 


-519 


522 


525- 








526 


530 


535 


541 








543 


550 


555 


586- 








588 


597- 


598 


605 








607 


610 


620 


-621 








624 


627 


631 


-633 








638 


-639 


643 


652 








654 


668- 


669 


672- 








673 


675- 


676 




leukocyte 


Clontech 


LUC003 


20 


47-49 52 


56 100 








112 


-114 


198 


-199 








314 


337- 


338 


348 








371 


438 


484 


502 








530 


537 


602 


633 








643 








melanoma from- 


Clontech 


MEL004 


14 


25 34 


47 


-49 56 


cell-line-ATCC- 






64 


66 83 


92 


106 


#CRL-1424 






111 


131 


134 


137 








139 


150 


162 


173- 
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TABLE 1 



Tissue Origin 


Library/RNA Source 


HYSEQ Library Name 


SEQ ID NOS:








174 189 192 210 








229 249 259-262 








290 321 337-338 








350 364 371 392- 








393 438 440-441 








444 475 493 499 








507 554 587 643 








651 667 669 671 


mammary gland 


Invitrogen 


MMG001 


5 7 10 16-17 19 25 








46-53 56 64 68 70 








79-82 85-86 89 92- 








95 98-100 106 121 








127 137 139-142 








144 150-152 158- 








159 161-164 180 








189 192-193 198 








202-203 205-206 








216 218-220 230 








245 249 252 259- 








263 267 270-272 








275-278 289-290 








298 302 305 308 








313 315 319 324 








329-330 336 346 








349 351 355-356 








359 364 368 370- 








371 377 384 390 








392-393 421 425 








427-428 436 444 








451 455-460 462 








465 473-474 487 








492 499 502 507 








516 518 524-526 








529-530 533-534 








539 543 583 590 








592 602 605 613 








623 627 631-632 








643 646 660 677- 








678 682 


induced neuron- 


Stratagene 


NTD001 


17 20 23 68 79 89 


cells 






153 155 181-182 








212 218-219 235 








298 346 352 358 








376 438 478 484 








488-489 492-493 








499-501 541 570 








619 627 643 662 








672-673 


retinoic acid- 


Stratagene 


NTR001 


7 23 56 68 70 131 


induced






186 189 213-214 


neuronal cells






290 293 342 461 








499 504-506 530 








601-602 607 682 


neuronal cells 


Stratagene 


NTU001 


7 29 42 68 70 84- 








85 92 131 140 147- 








148 202-203 259- 








262 305 316 319 








336 371 395-398 










461 


493 


499 


502 








537 


550 


553 


592 








652 


672- 


673 


682 


pituitary gland 


Clontech 


PIT004 


2-4 


47-49 55


68 72








93 137-138 141-142 








i cn 
lOu 


T C.A 


1 


.1 CO 








Iff 


LoA 


1 QO 

Jiy A 


AZX 








A Ay 


A l A " 


oil 

A i J 


AyU 








z y o 


*3 ft Q 


*51 C 
Olo 










TOO 


oil 


5*±A 


~1 A C 








ICC 

J bo 


o b U 


4 J o 


ACQ 

4oy - 








4oU 


A /TO 

4 oz 


A "7*2 

4 / J 


ATA 

-4 /4 








A Q A 

4 o4 


c ft a 
5U4 - 




CO A 

DZ4 










t> J 4 


04 -L 










Oo4 


b Z 3 


djX 










Ojj 


£ A *3 

O 7 J 


662 




placenta


Clontech


PLA003 


J. - b 


/ -LZ 


26 


37 41 








c o 


o4 /a 


85 


87 96 








in/* 

1 U b 


- ±U / 


112 


-114 








111 


1 C T 
iDl- 


152 


157 








O O *3 
£ J J 


O "3 c 
A Jo 


256 


-262 








O U J 




316 


335 








1 c. ft 




359 


371 








Ann 


A O Q 
4 A O 


431 


435 








A "3 Q 
4 Jo 


A A C, 
44j- 


446 


462 








A QQ 




516 


520 








530 


532 


537 


543 








550 


556 


565 


579 








587 


594- 


595 


626 








635 


638- 


639 




prostate 


Clontech 


PRT001 


20 


25 56 


173-174 








205 


250 


256 


280 








284 


-285 


299 


302 








309 


320 


323 


-324 








331 


342 


349 


362 








367 


384 


386 


392 








400 


415 


438 


484 








498 


507 


524 


532- 








534 


590 


620 


-621 








623 


631- 


632 


654 








677 


-678 


680 




rectum 


Invitrogen


REC001 


7 10 20 


47- 


50 52 








85-


87 89 109-110








126 


128- 


130 


157 








163 


170 


173 


-174 








177 


205 


220 


229 








256 


259- 


262 


289 








319 


324 


327 


340- 








341 


347 


364 


368 








371 


377 


415 


-416 








423 


-424 


427 


436 








465 


504- 


506 


581- 








582 


602 


610 


679 


salivary gland 


Clontech 


SAL001 


5 10 22 


25 


43 52 








63- 


64 67 89 


95 97 








99 


137 140 


161 165 








167 


-169 


180 


205 








229 


252 


256 


290 








323 351 368 371 








430 436 438 487 








502 507 516 525 








564 580 613 617- 








618 631-632 


saliva gland 


Clontech 


SALs03 


20 


skin fibroblast 


ATCC 


SFB001 


208 


skin fibroblast 


ATCC 


SFB002 


208 


small intestine 


Clontech 


SIN001 


5 7-8 10 15 24 26 








37-38 47-49 51-54 








56-57 59 64 67-68 








72 75 88 93 96-97 








100 106 108 111 








116-117 121 128- 








131 137 140 153 








158-159 177 189 








191 202-203 206 








215 229 253 255- 








256 259-262 264- 








265 272 280 296 








300-301 308-309 








316-318 325 327 








332-333 335 337- 








338 344-345 347 








352 359 368 371 








386 390 392-393 








423 431 435 438 








444 462 479 484 








492 507 509 522 








525-526 532 534 








550 572-573 581 








593 605 620-621 








623 628 632 643 








650 652-654 672- 








673 


skeletal muscle 


Clontech 


SKM001 


5 62 101 104 134 








165 254 272 289 








300-301 308 316 








323 356 377 402 








428 431 438 444 








451 462 541 543 








550 572-573 586 


skeletal muscle 


Clontech 


SKM002 


208 507 


spinal cord 


Clontech 


SPC001 


13 15 26-27 33-34 






38-40 46-50 52-53 








56 68 80-82 87 89 








92-95 131 150 155 








163 175-176 180 








186 197 199 202- 








203 205 211 213-








214 229 231 235 








254 263 289 307 








311 314-316 323- 








324 329 340-342 








348-349 352 359 








364 371 384 400 








438 451 484 493 








500 507 509 511 








516 


522 


525 


530 








532 


537 


562 


-563 








567 


-568 


580 


595 








597 


603 


607 


610 








612 


-613 


616 


620- 








622 


627 


643 


653 








672 


-673 


675 


677- 








678 








adult spleen 


Clontech 


SPLc01


7 9 


13 


17 26


37 43








64 


75 106 112-114 








118 


131 


163 


212 








216 


218 


-219 


256 








259 


-262 


308 


314 








"3 O Q 
J Z y 


-330 


349 


368 








390 


392 


-393 


422- 








424 


427 


431 


435- 








436 


451 


453 


484 








500 


-501 


509 


525 








530 


532 


535 


-536 








541 


592 


600 


610 








613 


623 


628 


631- 








632 


635 


645 


654 








663 


668 


672- 


-673 








679 








bone marrow 


null 


STM001 


7 43 162 252 256 








305 


371 


427 


438 








530 


607 


651 


658 


stomach 


Clontech 


STO001 


67 93 95 135 230 








259- 


-262 


284- 


-285 








289 


302- 


-303 


308 








320 


323 


390 


392- 








393 


420 


428 


436 








484 


507 


524- 


525








530 


536 


587 


631- 








632 


637 






thalamus 


Clontech 


THA002 


10 18 24 


33 


47-50 








54 58


60 68


90 92- 








93 98 100 102 160- 








161 


180 


205 


208 








229- 


230 


242 


259- 








262 


272 


296 


302- 








305 


325 


331 


342 








359 


384 


386 


390 








425 


511 


532 


543 








572- 


573 


587 


602 








608- 


610 


612 


616 








620- 


621 


631- 


632 








660 








thymus 


Clontech 


THM001 


5 12 


39- 


40 43 47- 








50 54 56 


66 


68 70 








79


87-88


93 


106- 








107 


131 


135 


144 








162 


173- 


174 


177 








192 


198 


205 


211 








218- 


219 


229 


256 








281 


289- 


290 


293 








306 


308 


314 


317- 








318 


321 


323 


325 



thymus 



Clontech 



THMc02 



thyroid gland 



Clontech 



THR001



331-333 347 349- 
352 368 371 384 
389 420 425 438 
440-441 484 487 
493 498-499 502 
509 530 532 541 
554-555 558 597- 
599 610 613 616 
620-621 624 643 
671-673 682 



5 8 10 12 25 32 34 
37 39 43 45-46 48- 
50 53 55-56 61 63 
65-67 70 83 85 87- 
88 94 106-107 112- 
114 116-118 120 
131 135 140-142 
144 150-152 158-
159 163-165 179
189 208 229 232- 
234 256 259-262 
273 289-290 302 
305 316-318 324- 
325 335 349 361 
363-364 371 384 
389 392-393 421- 
424 437-441 443 
445-446 451 459- 
461 473-474 498 
500 504-507 509 
518 522 526 530 
541 554 564 583 
592 600 607 610 
613 624-625 627 
630-632 634 637 
643-645 651 667 
669 671-673 682 



6 14-15 19 
32 34 39-40 
56 61-63 66 
75 87 93 95 
104-106 115 
131 137 141 
154 157 162 
168-169 175 
182 189 191 
202-203 211 
219 221 229 
234 249 254 
282 289-290 
302 306-308 
316 323-324 
329-330 342 
353-358 368 
377 380 383 
400 423-424 
431 436-438 
441 446 451 



26 29 

47-52 
-68 72 
100 
128- 
-142 
165 
177 
-193 
217- 
231- 
256 
298 
314- 
327 
350 
371 
-384 
426 
440- 
459- 








461 


475 


478 


484 








487 


-489 


491 


-492 








499 


-500 


502 


-506 








509 


518 


-519 


521- 








522 


530 


532 


-533 








541 


543 


567 


-568 








586 


588 


597 


-600 








605 


607 


610 


617-








618 


620 


-621 


624- 








626 


631 


-632 


635 








643 


651 


654 


662 








668 


671 


-672 


680 


trachea 


Clontech 




7 22 38 


40 56 68 








83 94 229 259-262 








289 


296 


298 


360 








371- 


-375 


438 


484 








499 


511 


521 


541 








571- 


-573 


588 


613 








624 


627 






uterus 


Clontech 




17 36 70 76 


103 








106 


109 


112- 


114








131 


150 


157 


179- 








180 


189 


290 


296 








308 


314 


320 


329- 








330 


356 


364 


366 








368 


390 


395- 


398 








415 


438 


447 


507 








509 


519 


525 


529 








532 


564 


620- 


621 








631- 


632 


662 


668- 








669 


682 







*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal
adult brain mRNA (Invitrogen), 2) Normal adult kidney mRNA (Invitrogen),
3) Normal fetal brain mRNA (Invitrogen), 4) Normal adult liver mRNA
(Invitrogen), 5) Normal fetal kidney mRNA (Invitrogen), 6) Normal fetal
liver mRNA (Invitrogen), 7) Normal fetal skin mRNA (Invitrogen), 8) Human
adrenal gland mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10)
Human leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA
(Clontech), 12) Human lymph node mRNA (Clontech), 13) Human spinal cord
mRNA (Clontech), 14) Human thyroid mRNA (Clontech), 15) Human esophagus
mRNA (BioChain), 16) Human conceptional umbilical cord mRNA (BioChain)
SEQ ID NO:


Hit ID


Species


Description


Score


Percentage
Identity


685




Homo sapiens


chorionic somatomammotropin




100


685 


gil81127 


Homo sapiens 


chorionic somatomammotropin precursor 


275 


96 


685 


gil83153 


Homo sapiens 


chorionic somatomammotropin CS-2 


275 


96 


686 


gil83178 


Homo sapiens 


hGH-V2 


1033 


78 


686 


gil83153 


Homo sapiens 


chorionic somatomammotropin CS-2 


710 


87 


686 


gi3 87024 


Homo sapiens 


placental lactogen hormone precursor 


710 


87 


688 


gil83178 


Homo sapiens 


hGH-V2 


1051 


79 


688 


gil8H2l 


Homo sapiens 


chorionic somatomammotropin 


788 


95 


688 


gil8315l 


Homo sapiens 


chorionic somatomammotropin CS-1 


788 


95 


689 


gil265350l 


Homo sapiens 


Similar to serine (or cysteine) proteinase 
inhibitor, clade F (alpha-2 antiplasmin, 
pigment epithelium derived factor), 
member 1 


1242 


99 


689 


gil5217079 


Homo sapiens 


pigment epithelium-derived factor 


1242 


99 


689 


gi 189778 


Homo sapiens 


pigment epithelial-differentiating factor 


1242 


99


690 


gil7128288 


synthetic 
construct 


Primer 1 


1150 


99 


690 


gi20269957 


Sus scrofa 


phospholipase C delta 4 


1033 


88 


690 


gi2l307610 


Mus musculus 


phospholipase C delta 4


909 


77 


691 


gil7864023 


Homo sapiens 


KCCR13L 


3524 


100 


691 


gi21483462 


Drosophila 
melanogaster 


LD44686p 


533 


36 


691 


gi21741717 


Oryza sativa 


oj99lll3_30.22 


127 


29 


692 


gil742S818 


Ralstonia 
solanacearum 


GALA PROTEIN 3 


117 


32 


692 


gi2 1 536497 


Arabidopsis 
thaliana 


F-box protein family, AtFBL4 


115 


30


692 


gi 1258 1 504 


Trypanosoma 
brucei 


GUI 


i if 
1 1 J 


33 


693


gl43 /DO 2 


Oryctolagus 
cuniculus 


interleukin-8 receptor subtype B


1 QA 


01 


693 


gi 186378 


Homo sapiens 


interleukin 8 receptor B 


178 


57 


693


gU 1 0 V 1 


Homo sapiens 


interleukin-8 receptor type B


178


J 1 


694 


gi3335098 


Homo sapiens 


CD39L2




100


694 


gil 1230487 


Rattus 
norvegicus 


NTPDase6


ZUOj 


OO ! 


694 


gi5139519 


Mus musculus 


nucleoside diphosphatase (ER-UDPase) 


1008 


53 


695 


gi21928620 


Homo sapiens 


seven transmembrane helix receptor 


1858


100


695 


gil6566319 


Homo sapiens 


G protein-coupled receptor 


1843 


99 


695 


gi6644328 


Rattus 
norvegicus 


orphan G protein-coupled receptor 
GPR26 


822 


50 


696 


gi7H02l6 


Homo sapiens 


C-type lectin-like receptor- 1 


851 


99 


696 


gi7l0973l 


Homo sapiens 


C-type lectin-like receptor-2 


256 


31 


696 


gi2038l202 


Mus musculus 


Similar to C-type (calcium dependent, 
carbohydrate recognition domain) lectin, 
superfamily member 12 


196 


27 


697 


gi22449809 


Chaoborus 
trivitattus 


cytochrome oxidase I 


50 


44


697 


gi235l328 


Newcastle 
disease virus 


fusion protein 


59 


44 


697 


gi2l3il450 


Galleria
mellonella 


antifungal peptide gallerimycin 


55 


33 


698 


gil8089247 


Homo sapiens 


Similar to ectonucleoside triphosphate 
diphosphohydrolase 5 


2104 


100 
698


gi3335102 


Homo sapiens 


CD39L4


2104 


100 


698 


gi 15076827 


Homo sapiens 


Pcph proto-oncogene protein 


2090 


99 


699 


gi 15 1242 


Pseudomonas 
aeruginosa 


heat shock protein 


79 


38 


699 


gi9950616 


Pseudomonas 
aeruginosa 


GroES protein 


79 


38


699 


gi2564287 


Pseudomonas 
stutzeri 


Hsp10 protein


79 


44 


701 


gi20521055 


Homo sapiens 


Start codon is not identified 


724 


32 


701 


gil7225457 


Homo sapiens 


autism-related protein 1 


676 


32 


701 


gil5145797 


Sus scrofa 


basic proline-rich protein 


156 


27 


702 


gi208 10589 


Homo sapiens 


similar to arsenite inducible RNA 
associated protein 


833 


99 


702 


gi9651711 


Mus musculus 


arsenite inducible RNA associated protein 


687 


80 


702 


gi!7390981 


Homo sapiens 


Similar to RIKEN cDNA 1 1 10060018 
gene 


535 


59 


703 


gi6624130 


Rattus 
norvegicus 


similar to 45 kDa secretory protein


2150 


100 


703 


013241652 


Rattus 
norvegicus 


supernatant protein factor 


2040 


93 


703 


gil9548982 


Bos taurus 


tocopherol-associated protein 


1930 


90 


704 


gil3 177766 


Homo sapiens 


Similar to presenilins associated 
rhomboid-like protein 


1761 


99 


704 


gil5559382 


Homo sapiens 


presenilins associated rhomboid-like 
protein 


1094 


98 


704 


gi7959883 


Homo sapiens 


PRO2207 


671 


82 


705 


gil864091 


Rattus 
norvegicus 


PSD-95/SAP90-associated protein-3 


5005 


95 


705 


gi2454510 


Homo sapiens 


PSD-95/SAP90-associated protein-2 


1338 


55 


705


gi6979173 


Homo sapiens 


discs, large (Drosophila) homolog- 
associated protein 2 


1011 


45 


706 


gill 877274 


Homo sapiens 


dJ726C3.2 (novel protein) 


2260 


99 


706 


gi21667210 


Homo sapiens 


bactericidal/permeability-increasing 
protein-like 1 


2260 


99 


706 


gi20387087 


Oncorhynchus 
mykiss 


LBP (LPS binding protein)/BPI 
(bactericidal/permeability-increasing 
protein) like-2 


349 


26 


707 


gi7291716 


Drosophila 
melanogaster 


CG11388-PA 


648 


39 


707 


gi 16768 190 


Drosophila 
melanogaster 


GH22974p 


647 


39 


707 


gi3954938 


Homo sapiens 


acetylglucosaminyltransferase-like 
protein 


171 


23 


708 


gil4334082 


Mus musculus 


thymus LIM protein TLP-A 


479 


87 


708 


gi 14334084 


Mus musculus 


thymus LIM protein TLP-B 


397 


79 


708 


gi487284 


Rattus 
norvegicus 


CRP2 (cysteine-rich protein 2) 


367 


75 


710 


gi556299 


Mus musculus 


alpha-2 type IV collagen 


8129 


83 


710 


gi30076 


Homo sapiens 


alpha-2 chain precursor (AA -25 to 1018) 
(3416 is 2nd base in codon) 


5916 


100 


710 


gi 15991848 


Homo sapiens 


A type IV collagen 


4239 


51 


711 


gi7861733 


Homo sapiens 


low density lipoprotein receptor related 
protein-deleted in tumor 


25831


99 


711 


gi8926243 


Mus musculus 


low density lipoprotein receptor related 
protein LRP1B/LRP-DIT 


24096


91 
711


gi438007 


Gallus gallus


alpha-2-macroglobulin receptor 


14197


63 


712 


gi 172983 15 


Homo sapiens 


candidate tumor suppressor protein 


848 


100 


712 


gi7861733 


Homo sapiens 


low density lipoprotein receptor related 
protein-deleted in tumor 


848 


100 


712 


gi8926243 


Mus musculus 


low density lipoprotein receptor related 
protein LRP1B/LRP-DIT 


731 


83 


713 


gil6877754 


Homo sapiens 


Similar to RIKEN cDNA 4930434H03 
gene 


574 


56 


713 


gi20071811 


Mus musculus 


Similar to RIKEN cDNA 4930434H03 
gene 


493 


60 


713 


gi!340174 


Homo sapiens 


type III procollagen (aa 892-1023) 


97 


40 


714 


gil57409 


Drosophila 
melanogaster 


fat protein 


1802 


31 


714 


gi4887715 


Drosophila 
melanogaster 


adherin 


1500 


36 


714 


gi 1107687 


Homo sapiens 


homologue of Drosophila Fat protein 


1514 


30 


715 


gi 157409 


Drosophila 
melanogaster 


fat protein 


1808 


31 


715 


gi4887715 


Drosophila 
melanogaster 


adherin 


1500 


36 


715 


gill 07687 


Homo sapiens 


homologue of Drosophila Fat protein 


1514 


30 


716 


gil786531t 


Homo sapiens 


dipeptidyl peptidase-like protein 9 


2562 


99 


716 


gi35 13303 


Homo sapiens 


R26984 I 


2700 


98 


716 


gill095188 


Homo sapiens 


dipeptidyl peptidase 8 


1397 


53 


717 


gi2689444 


Homo sapiens 


ZNF134 


1160 


54 


717 


gi2 13 14977 


Homo sapiens 


Similar to zinc finger protein 17 (HPF3, 
KOX 10) 


1038 


51 


717 


gil3543419 


Homo sapiens 


Similar to zinc finger protein 304 


1000 


51 


718 


gi7582294 


Homo sapiens 


BM-011 


881 


100 


718 


gil3937769 


Homo sapiens 


Similar to RIKEN cDNA 12000 13F24 
gene 


781 


98 


718 


gil78997 


Homo sapiens 


arginine-rich nuclear protein 


224 


38 


719 


gil620870 


Ciona 
intestinalis 


myoplasmin-Cl 


412 


28 


719 


gi74 16980 


Argopecten 
irradians 


myosin heavy chain catch (smooth) 
muscle specific isoform 


279 


23 


719 


gi74l6982 


Argopecten 
irradians 


myosin heavy chain cardiac muscle 
specific isoform 1 


279 


23 


720 


gil3872813 


Homo sapiens 


fibulin-6 


13764


100 


720 


gil4575679 


Homo sapiens 


hemicentin 


13720


99 


720 


gi3328186 


Caenorhabditis 
elegans 


hemicentin precursor 


1695 


30 


721 


gi3822553 


Gallus gallus 


nuclear calmodulin-binding protein 


1492 


64 


721 


gi3329496 


Mus musculus 


heterogenous nuclear ribonucleoprotein U 


1501 


45 


721 


gi624918 


Rattus 
norvegicus 


SP120 


1498 


45 


722 


gi 17223626 


Homo sapiens 


ATP-binding cassette A 10 


7966 


99 


722 


gi 17223624 


Homo sapiens 


ATP-binding cassette A9 


5160 


61 


722 


gil7223622 


Homo sapiens 


ATP-binding cassette A6 


5108 


61 


723 


gil3374079 


Homo sapiens 


TAFII 140 protein 


3677 


99 


723 


gil3374178 


Mus musculus 


TAFII140 protein 


3202 


84 
723


gi205686 


Rattus 
norvegicus 


heavy neurofilament subunit 


335 


26 


724 


gil7429038 


Ralstonia 
solanacearum 


PROBABLE ACYL-COA 
DEHYDROGENASE 
OXIDOREDUCTASE PROTEIN 


661 


61 


724 


gi9948609 


Pseudomonas 
aeruginosa 


probable acyl-CoA dehydrogenase 


619 


62 


724 


gil3421911 


Caulobacter 

crescentus 

CB15 


acyl-CoA dehydrogenase family protein 


559 


59 


725 


gi6752658 


Homo sapiens 


epidermal growth factor repeat containing 
protein 


3055 


99 


725 


gil 6040981 


Mus musculus 


POEM 


884 


51 


725 


gi 15430246 


Mus musculus 


nephronectin short isoform 


884 


51 


726 


gi6531661 


Caenorhabditis 
elegans 


LIN-41A 


844 


50 


726


gio531oo3 


Caenorhabditis 
elegans 


LIN-41B 


844 


50 


726


gi 12407367 


Homo sapiens 


tripartite motif protein TRIM2 


769 


30 


727 


gil504026 


Homo sapiens 


similar to C.elegans protein (Z37093) 


5833 


99 


727 


gi2896796 


Homo sapiens 


D1013901 


5115 


99 


727 


gi2522322 


Homo sapiens 


PTPL1-associated RhoGAP


1497 


36 


728 


gi 13274 120 


Homo sapiens 


dJ55C23.5.1 (vanin 3, isoform 1)


1467 


99 


728 


gi7l60973 


Homo sapiens 


VNN3 protein 


1213 


96 


728 


gi6 102996 


Mus musculus 


Vanin-3 


1018 


79 


729 


gi9581879 


Homo sapiens 


disintegrin metalloproteinase with 
thrombospondin repeats 


5723 


99 


729 


gi!9171176 


Homo sapiens 


metalloprotease disintegrin 15 with 
thrombospondin domains 


1669 


50 


729 


gil 1095299 


Rattus 
norvegicus 


ADAMTS-1 


1772 


40 


730 


gi21063967 


Drosophila 
melanogaster 


AT05453p 


396 


32 


730 


gi59H409 


Drosophila 
melanogaster 


fuzzy 


396 


32 


730 


gi2564657 


Drosophila 
melanogaster 


Fuzzy 


396 


32 


731


gil5217171


Homo sapiens 


CD81 partner 3 


2302 


100 


731 


gil5488017


Homo sapiens


EWI2 


2302 


100


731 


gi 15593237 


Mus musculus 


immunoglobulin superfamily receptor 
PGRL 


2186 


92 


732


gl 15Z17171 


Homo sapiens 


CD81 partner 3 


3200 


100 


732 


gil5488017 


Homo sapiens 


EWI2 


3200 


100 


732 


gi 15593237 


Mus musculus 


immunoglobulin superfamily receptor 
PGRL 


2867 


88 


733 


ei152l7l71 




CD81 partner 3


1303


96


733 


gil5488017 


Homo sapiens 


EWI2 


1303 


96 


733 


gi22266726 


Homo sapiens 


LLR-D1 precursor 


1303 


96 


734 


gi21748480 


Homo sapiens 


FLJ00271 protein


605 


100 


734 


gi22266726 


Homo sapiens 


LER-D1 precursor 


514 


79 


734 


gil5217171 


Homo sapiens 


CD81 partner 3 


514 


79 


735 


gi2196872 


Homo sapiens 


Lsc homologue 


203 


30


735 


gil3'89756 


Mus musculus 


Lsc 


199 


31 


735 


gil 1276027 


Rattus 


LSC 


199 


31 
norvegicus








736 


gil4336728 


Homo sapiens 


possible integral membrane 


331 


32 


736 


gi 18043242 


Mus musculus 


RIKEN cDNA 2400010G15 gene


331 


31 


736 


gi8895014 


Hepatitis B 
virus 


HBsAg 


68 


48 


737 


gi20071204 


Mus musculus 


Similar to paraspeckle protein 1 


185 


28 


737 


gi 18 104577 


Homo sapiens 


paraspeckle protein 1 alpha isoform 


175 


27 


737 


gil3528666 


Homo sapiens 


Similar to splicing factor 
proline/glutamine rich (polypyrimidine 
tract-binding protein-associated) 


179 


31 


738 


gil2002000 


Homo sapiens 


My029 protein 


415 


100 


738 


gi348140 


Human T- 
lymphotropic 
virus 2 


rex 


68 


39 


738 


gi404041 


Human T- 
lymphotropic 
virus 2 


rex protein 


68 


39 


739 


gi4680090 


Human
immunodeficiency
virus type 1


envelope glycoprotein 


89 


31 


740 


gi21627272 


Drosophila 
melanogaster 


CG12765-PA 


166 


38 


740 


gi 19528077 


Drosophila 
melanogaster 


AT24025p 


166 


38 


740 


gil066820 


Murray Valley 

encephalitis 

virus 


nonstructural protein 


66 


28 


741 


gi9916 


Plasmodium
falciparum


liver stage antigen 


4oo 


JLO 


741 


gil747 


Oryctolagus 
cuniculus 


trichohyalin 


414 


24 


741 


gi295941 


Ovis aries 


trichohyalin 


395 


24 


742 


gi9845485


Homo sapiens 


protocadherin-9 


OZJD 


1 AO 


742 


gi!505452i 


Homo sapiens 


protocadherin-S 


3390 


58 


742 


gil316l060 


Homo sapiens 


protocadherin 11


3382 


58


743 


gi5688958 


Homo sapiens 


PMMLP 


2405 


100


743 


gi21594625 


Mus musculus 


RIKEN cDNA 4931406N15 gene 


2241 


92 


743 


gil6797814 


Drosophila 
melanogaster 


phosphomannomutase 45A 


1194 


51 


744 


gi2 1734445 


Rattus 
norvegicus 


BMP/Retinoic acid-inducible neural-
specific protein-2


3987 


94 


744 


gi20988899 


Mus musculus 


similar to deleted in bladder cancer 
chromosome region candidate 1


2952 


70


744 


gi2 1734447


Rattus 
norvegicus 


BMP/Retinoic acid-inducible neural- 
specific protein-3 




/u 


745 


gi2739353 


Homo sapiens 


ZNF91L 


2075 


69 


745


gitui / /zz 


Homo sapiens 


repressor transcriptional factor 




71 


745 


gi4559318 


Homo sapiens 


BC273239 1 


2031 


67 


746 


gil017722 


Homo sapiens 


repressor transcriptional factor 


2144 


73 


746 


gi2739353 


Homo sapiens 


ZNF91L 


2054 


70 


746 


gil86774 


Homo sapiens 


zinc finger protein 


2035 


70 


747 


gil9683999 


Homo sapiens 


coated vesicle membrane protein 


1010 


99 


747 


gil212965 


Homo sapiens 


transmembrane protein 


1010 


99 


747 


gil213221 


Rattus 
norvegicus 


transmembrane protein 


1006 


98 
748


gi 1199524 


Homo sapiens 


acid phosphatase 


2036 


98 


748 


gi34263 


Homo sapiens 


acid phosphatase precursor protein 


2036 


98 


748 


gil3111975 


Homo sapiens 


acid phosphatase 2, lysosomal 


2032 


98 


749 


gi 15625570 


Homo sapiens 


centaurin beta5 


2970 


83 


749 


gi4688902 


Homo sapiens 


centaurin beta2 


1708 


64 


749 


gi436228 


Homo sapiens 


Start codon is not identified 


1387 


70 


750 


gi 10197642 


Homo sapiens 


MDS022 


647 


100 


750 


gil9683046 


Dictyostelium 
discoideum 


HYPOTHETICAL 21.8 KDA PROTEIN. 
3/101 


94 


26 


750 


gi6841554 


Homo sapiens 


HSPC166 


93 


24 


751 


gi5630080 


Homo sapiens 


similar to HUB1; similar to BAA24380 
(PID:g2789430) 


696 


48 


751 


gi2789430 


Homo sapiens 


repressor protein 


702 


39 


751 


gi 186 14026 


Homo sapiens 


zinc finger DNA binding protein p71 


1004 


41 


752 


gil2140290 


Homo sapiens 


bA12M19.2.1 (vacuolar protein sorting 
protein 16 (VPS 16)) 


2885 


92 


752 


gi 11345382 


Homo sapiens 


vacuolar protein sorting protein 16 


2885 


92 


752 


gi 19343731 


Mus musculus


vacuolar protein sorting 16 (yeast 
homolog) 


2803 


89 


753 


gi20987877 


Mus musculus 


similar to Nogo receptor 


905 


58 


753 


gi9280025 


Macaca 
fascicularis 


Nogo receptor 


808 


49 


753 


gi 15080005 


Homo sapiens 


nogo receptor 


796 


48 


754 


gi 177870 


Homo sapiens 


alpha-2-macroglobulin precursor


2714 


39 


754 


gi579592 


Homo sapiens 


alpha 2-macroglobulin 690-730 


2708 


39 


754 


gi579594 


Homo sapiens 


alpha 2-macroglobulin 690-740 


2700 


39 


755 


gi4929790 


Homo sapiens 


angiopoietin-related protein 3 


1423 


89 


755 


gil3 159474 


Homo sapiens 


CG006-alt2 


1416 


88 


755 


gi5639997 


Mus musculus 


angiopoietin-related protein 3 


1109 


77 


756 


gi200057 


Mus musculus 


neuronal glycoprotein 


4821 


87 


756 


gi563133 


Rattus 
norvegicus 


BIG-1 protein 


4778 


87 


756 


gil0l60i2 


Rattus 
norvegicus 


neural cell adhesion protein BIG-2 
precursor 


3867 


68 


757 


gi6273399 


Homo sapiens 


melanoma-associated antigen MG50 


344 


33 


757 


gi 15 04040 


Homo sapiens 


similar to D.melanogaster 
peroxidasin(U11052) 


344 


33 


757 


gil4495561 


Homo sapiens 


brain tumor associated protein LRRC4 


324 


27 


758 


gi6273399 


Homo sapiens 


melanoma-associated antigen MG50 


344 


33 


758 


gi 15 04040 


Homo sapiens 


similar to D.melanogaster 
peroxidasin(U11052)


344 


33 


758 


gi 14495561 


Homo sapiens 


brain tumor associated protein LRRC4 


329 


26 


759 


gi5525078 


Rattus 
norvegicus 


seven transmembrane receptor 


5062 


72 


759 


gi21929093 


Homo sapiens 


seven transmembrane helix receptor


1712 


88 


759 


gi4 164023 


Bos taurus 


latrophilin 2 splice variant baaaf


383 


27 


760 


gil0440398 


Homo sapiens 


FLJ00032 protein 


1261 


57 


760 


gil 1917507 


Homo sapiens 


HPF1 protein 


1258 


60 


760 


gil3752754 


Homo sapiens 


zinc finger 1111 


1253 


60 


761 


gi3628757 


Homo sapiens 


FIC1 


1436 


54 


761 


gil3097633 


Homo sapiens 


Similar to ATPase, Class I, type 8B, 
member 1 


1221 


60 


761 


gi20147219 


Arabidopsis 
thaliana


At1g59820/F23H11_14


1637 


41 
762


gil 1527987 


Gallus gallus 


immunoglobulin-like receptor CHIR-A 


97 


30 


762 


gi432214 


Human
immunodeficiency
virus type 1


envelope glycoprotein gpl20 


43 


39 


762 


gil5026993 


Homo sapiens 


MUC5AC protein 


64 


38 


763 


gil 1558486 


Homo sapiens 


B-cell lymphoma/leukaemia 11A short
form 


1314 


99 


763 


gi7546791 


Mus musculus 


CTIP1 protein 


1149 


99 


763 


gi7650184 


Mus musculus 


ecotropic viral integration site 9 isoform 
C 


1155 


95 


764 


gi22085890 


Rattus 
norvegicus 


FHA-HIT 


1426 


82 


764 


gi2 1430028 


Drosophila 
melanogaster 


GM01362p 


338 


40 


764 


gi21166012 


Dictyostelium 
discoideum 


2410016G21RIK PROTEIN


279 


26 


765


gi22085890 


Rattus 
norvegicus 


FHA-HIT


214 


oo 


765 


gi5764101 


Homo sapiens 


polynucleotide kinase-3-phosphatase 


Af 


50 


765 


gi5712131 


Homo sapiens 


DEMI protein 


93 


50 


766 


"TIAOf OA A 

gi22085890 


Rattus 
norvegicus 


FHA-HIT 


278


89 


766 


gi5764101 


Homo sapiens 


polynucleotide kinase-3-phosphatase


109


46


766 


gi5712131 


Homo sapiens 


DEMI protein 


107 


46


768 


gi 15 186770 


Homo sapiens 


lysyl oxidase-like protein 


1818 


96 


768 


gi 14009597 


Homo sapiens 


lysyl oxidase-like 3 protein 


1818


96 


768 


gil 5030096 


Mus musculus 


Similar to lysyl oxidase-like 2 


1715 


92 


769 


gi3954938 


Homo sapiens 


acetylglucosaminyltransferase-like 
protein 


2298


70


769 


gi3954978 


Mus musculus 


acetylglucosaminyltransferase-like 
protein 


2298 


70 


769 


gi!0834722 


Homo sapiens 


PP5656 


892 


91 


770 


gi7209723 


Homo sapiens 


WD-repeat like sequence 


2476 


99 


770 


gi82l7485 


Homo sapiens 


dJ1092A11.3 (WD repeat domain)


2473 


99 


770 


gi720972l 


Mus musculus 


DD57 


2243 


88 


771 


gil8676632 


Homo sapiens 


FLJ00215 protein 


1943 


99 


771 


gil8447l98 


Drosophila 
melanogaster 


GH09355p 


140 


19 


771 


gi29567l 


Saccharomyces 
cerevisiae 


selected as a weak suppressor of a mutant 
of the subunit AC40 of DNA dependant 
RNA polymerase I and III 


119 


22 


772 


gil0799l66 


Homo sapiens 


protein kinase Njmu-R1


1915 


99 


772 


gi2li04460 


Homo sapiens 


OK/SW-CL.19 


549 


100 


772 


gil4290030 


Human
immunodeficiency virus type 1


pol protein 


68 


30 


773 


gi4 186023 


Homo sapiens 


CDS2 protein 


2376 


100 


773 


gi!9344052 


Homo sapiens 


similar to PHOSPHATIDATE
CYTIDYLYLTRANSFERASE 2 (CDP-
DIGLYCERIDE SYNTHETASE 2)
(CDP-DIGLYCERIDE
PYROPHOSPHORYLASE 2) (CDP-
DIACYLGLYCEROL SYNTHASE 2)
(CDS 2) (CTP:PHOSPHATIDATE
CYTIDYLYLTRANSFERASE 2) (CDP-
DAG SYNTHASE 2) (CDP-DG
SYNTHETASE 2)...


2376


100






773 


gil3277972 


Mus musculus 


Similar to CDP-diacylglycerol synthase
(phosphatidate cytidylyltransferase) 2 


2289 


96 


774 


gi 17862928 


Drosophila 
melanogaster 


SD03549p 


125 


35 


774 


gi 18077663 


Mus musculus 


cockayne syndrome group A 


117 


38 


774 


gil4091657 


Mangifera 
indica 


F6N15.8-like protein


107 


29 


776 


gil8676664 


Homo sapiens 


FLJ00231 protein 


1473 


99 


776 


gi 16303748 


Homo sapiens 


tweety-like protein 2 


1053 


41


776 


gil6303750 


Mus musculus 


tweety homolog 2 


987 


39 


777 


gi8 118032 


Homo sapiens 


orphan G-protein coupled receptor 


939 


98 


777 


gil6877193 


Homo sapiens 


G protein-coupled receptor, family C, 
group 5, member C 


939 


98 


777 


gi9588669 


Homo sapiens 


GPRC5C 


939 


98 


778 


gi20380605 


Mus musculus 


RIKEN cDNA 8430424D23 gene 


836 


91 


778 


gil6769562 


Drosophila 
melanogaster 


LD38910p 


333 


47 


778 


gi7302978 


Drosophila 
melanogaster 


CG8441-PA 


333 


47 


779 


gil6041781 


Homo sapiens 


Similar to RIKEN cDNA 0710001C05 
gene 


776 


99 


779 


gi21430012 


Drosophila 
melanogaster 


GH27470p 


333 


53 


779 


gi!5074454 


Sinorhizobium 
meliloti 


CONSERVED HYPOTHETICAL 
PROTEIN 


239 


43 


780 


gil39590l8 


Homo sapiens 


endothelial cell-selective adhesion 
molecule 


902 


100 


780 


gil3991773 


Mus musculus 


endothelial cell-selective adhesion 
molecule 


643 


70


780 


gil814277 


Homo sapiens 


A33 antigen precursor 


229 


34 


781 


gi8164184 


Homo sapiens 


22kDa peroxisomal membrane protein- 
like 


1013 


100 


781 


gil5422l71 


Homo sapiens 


22 kDa peroxisomal membrane protein 2 


1013 


100 


781 


gi297437 


Rattus 
norvegicus 


peroxisomal membrane protein 


798 


76 


782 


gi7621329 


Streptococcus 
pyogenes 


Sicl.245 


214 


39 


782 


gi7620883 


Streptococcus 
pyogenes 


Sicl.23 


215 


39 


782 


gi7620875 


Streptococcus 
pyogenes 


Sicl.19 


215 


39 


783 


gi62877 


Gallus gallus 


type VI collagen alpha-2 subunit 
preprotein 


751


41 


783 


gi62882 


Gallus gallus 


type VI collagen subunit alpha2 


751 


41 


783 


gi2U616 


Gallus gallus 


type VI collagen, alpha-2 subunit 


747 


45 


784 


gil7945608 


Drosophila 
melanogaster 


RE26969p 


829 


48 


784 


gi3877350 


Caenorhabditis 
elegans 


contains similarity to Pfam domain:
PF01598 (Sterol desaturase),
Score=307.6, E-value=4.7e-89, N=1


572 


38 


784 


gi3877351 


Caenorhabditis 
elegans 


contains similarity to Pfam domain: 
PF01598 (Sterol desaturase), 
Score=303.0, E-value=1.1e-87, N=1


546 


38


785


gi 17066 106 


Homo sapiens 


1 TPJaJ— T _ _ C— ..... 

jNovex- J i inn isoiorm 


8817 


"oo 
yy 


785 


giz 123 8628 


Sparisoma 
viride 


titin-like protein 


•^10 


OZ 


/OJ 


glzlZJooJU 


Sparisoma 
aurofrenatum


titin-like protein 


S1 0 


£7 
OZ 


787 

/o/ 


r»i77108AO 
glZZjU5-+U 


Ginkgo biloba


IlUIlD 




^4 


787 


m77in878 
glZZoUoZo 


Dioon edule


nuiLD 




^0 


787 
10/ 


m 0770001 
giyz fyyy 1 


Sequoia
sempervirens


maturase




JO 


788 
/OO 


oil Q.£n(\fk\ o 
glloO/OOlU 


Homo sapiens


rjuJvv/zvH pruiem 


204 


97 


7852 
/ OO 


glJUUZJOO 


Mus musculus


Plenty of SH3s, POSH


906 


9S 


788 
/oo 


oi 1 407 


Mus musculus


C 141 PI 


1^4 


4S 

*+J 


780 


oil R£7AA1 n 
glloO/OOlU 


Homo sapiens


rLsiuuzuH proiem 


Iff) 


97 
z / 


78Q 

/oy 


oi^nn7S88 


Mus musculus


Plenty of SH3s, POSH


990 


9S 


78Q 
toy 


oil 407^6^ 


Mus musculus




140 


DO 




oil 87/181 
gl lOZ*tO D 


Homo sapiens


prefibroblast collagenase inhibitor


J J 1 


oo 


7Q0 


oiA0n0Q4 


Homo sapiens


TIMP


s^l 


88 
oo 


7Q0. 


oil 80180 

guoiooz 


Homo sapiens


collagenase inhibitor




oo 


70 1 

/yi 


oi7 1 1 07 1 f*. 
gl / 1 1UZ10 


Homo sapiens


C-type lectin-like receptor-1


8S1 
0 J 1 


0Q 

yy 


701 


oi71 00711 
gl / L\Jy/JL 


Homo sapiens 


C-type lectin-like receptor-2


7^A 


J i 


70 1 

/y i 


oi1Q07Q8? 

giisjuzs/oz 


Bos taurus


lectin-like oxidized LDL receptor




Tl 


792 


gi5802604 


Cavia porcellus 


UDP glucuronosyltransferase UGT2A3 


1783 


73 


707 


rril 0187QAT1 


Mus musculus 


PTK'RM rHMA 7010171 T07 ophip* 

KiisjirN cuina zuiujzuu/ gene 


1700 


AO 


707 

lyl 


rri/17-\17£/£ 

_gl4/ J J /OO 


Homo sapiens 


UDP glucuronosyltransferase




o/ 


70*7 

lyo 


rri1£880QO 


Homo sapiens 


P17/^1 1 7 
1OZ011 Z 


78£ 
/OO j 


01 


701 

/yj 


glOO*+lZZo 


Homo sapiens 


ucpr^co 
xlorC-ZoV 


^18 
Ojo 


78 
/ o 


701 
fyD 


oi7 1A1 8688 
glZlOloOoo 


Mus musculus


PIITT7W pnWA ^810d08P14 (TAnp 

JxiiSJiiN ci-JiNA jQjsJHyoy^iH gene 


44S 


^9 

■JZ 


794 


gi9963861 


Homo sapiens 


Cytl9 


1729 


99 


7Q4 
/y*t 


oi 1 -\488A4^ 
gll»)*+OoO-+J 


Mus musculus


methyltransferase Cyt19


1 J D J 


7A 
/o 


794 


gil8 150409 


Rattus 
norvegicus 


S-adenosylmethionine:arsenic (III)
methyltransferase 


1516 


76 


795 


•11 0700/10 

gll lo//Z43 


Homo sapiens 


SSF1/P2Y11 chimeric protein


1 Q-\7 


o< 


795 


giz 161 9996 


Homo sapiens 


peter pan homolog (Drosophila) 


ZUoU 




795 


gil4602631 


Homo sapiens 


peter pan (Drosophila) homolog 


2080 


99 


796 


gi20330550 


Homo sapiens 


NK inhibitory receptor precursor 


799 


98


796 


gi20380183 


Homo sapiens 


similar to CMRF35 leukocyte 
immunoglobulin-like receptor 


727 


92 


796 


gi20381405 


Homo sapiens 


similar to CMRF35 leukocyte 
immunoglobulin-like receptor; CMRF35 
antigen 


423 


57 


797 


gi20330550 


Homo sapiens 


NK inhibitory receptor precursor 


799 


98


797 


gi20380183 


Homo sapiens 


similar to CMRF35 leukocyte 
immunoglobulin-like receptor 


727 


92 


797 


gi20381405 


Homo sapiens 


similar to CMRF35 leukocyte 
immunoglobulin-like receptor; CMRF35 
antigen 


423 


57 


798 


gi20330550 


Homo sapiens 


NK inhibitory receptor precursor 


1469 


94 


798 


gi20380183 


Homo sapiens 


similar to CMRF35 leukocyte 
immunoglobulin-like receptor 


690 


84 


798 


gi20330544 


Mus musculus


polymeric immunoglobulin receptor 3 
precursor 


416 


52 


799 


gil8307481 


Homo sapiens 


phosphoinositide-binding proteins 


2122 


100 


799 


gi3930781 


Homo sapiens 


connector enhancer of KSR-like protein 
CNK1 


346 


34 



WO 2004/080148 



PCT/US2003/030720 



150 
TABLE 2 A 



SEQ 
ID 


Hit ID 


Species 


Description 


S score


Percentage 
identity 


799 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


455 


37 


800 


gil5929988 


Homo sapiens 


Similar to TLH29 protein precursor 


417 


89 


800 


gil 1493982 


Homo sapiens 


TLH29 protein precursor 


274 


72 


800 


gi20 147034 


Mus musculus 


interferon stimulated gene 12 


235 


68 


801 


gil5929988 


Homo sapiens 


Similar to TLH29 protein precursor, 
clone MGC:21991 IMAGE:4398045, 
mRNA, complete cds. 


445 


100 


801 


AAW54040 


Homo sapiens 


Human interferon-inducible protein, 
HIFI. 


432 


97 


801 


gil 1493982 


Homo sapiens 


TLH29 protein precursor (TLH29) 
mRNA, complete cds. 


303 


70 


802 


gil2082725 


Mus musculus 


B cell phosphoinositide 3-kinase adaptor 


3561 


84 


802 


gil2082723 


Gallus gallus 


B cell phosphoinositide 3-kinase adaptor 


2840 


69 


802 


gi20987486 


Homo sapiens 


similar to B cell phosphoinositide 3- 
kinase adaptor 


1830 


97 


803 


gi7959809 


Homo sapiens 


PRO1082 


545 


100 


803 


gi7767407 


Avian 
infectious 
bronchitis virus 


5a protein 


61 


26 


803 


gil5073792 


Sinorhizobium 
meliloti 


PUTATIVE FOSMIDOMYCIN 
RESISTANCE ANTIBIOTIC 
RESISTANCE TRANSMEMBRANE 
PROTEIN 


71 


38 


804 


gil5384843 


Homo sapiens 


NTB-A receptor 


1700 


100 


804 


gil 5384841 


Homo sapiens 


activating NK receptor 


1687 


99 


804 


gi9887089 


Mus musculus 


lymphocyte antigen 108 isoform 1 


637 


44 


805 


gil 7979255 


Arabidopsis 
thaliana 


AT5g49550/K6M13_10 


211 


72 


805 


gil0177621 


Arabidopsis 
thaliana 


phytoene dehydrogenase-like 


195 


75 


805 


gil4023915 


Mesorhizobium 
loti 


phytoene dehydrogenase 


182 


62 


806 


gil4270364 


Mus musculus 


Epigen protein 


386 


71 


806 


gi755468 


Xenopus laevis 


transmembrane protein 


120 


36 


806 


gi7799191 


Mus musculus 


tomoregulin-1 


125 


52 


807 


gi 14270364 


Mus musculus 


Epigen protein 


386 


71 


807 


gi755468 


Xenopus laevis 


transmembrane protein 


120 


36 


807 


gi7799191 


Mus musculus 


tomoregulin-1 


125 


52 


808 


gil4270364 


Mus musculus 


Epigen protein 


386 


71 


808 


gi755468 


Xenopus laevis 


transmembrane protein 


120 


36 


808 


gi7799191 


Mus musculus 


tomoregulin-1 


125 


52 


809 


gi3068592 


Mus musculus 


punc 


201 


41 


809 


gi22003417 


Danio rerio 


neogenin 


193 


40 


809 


gil881477 


Mus musculus 


neogenin protein 


167 


33 


810 


gil5072404 


Raja erinacea 


organic solute transporter beta 


92 


41 


810 


gi 143486 


Bacillus subtilis 


levansucrase 


59 


37 


810 


gi 143484 


Bacillus subtilis 


levansucrase (sacB) 


58 


35 


811 


gil 8650588 


Homo sapiens 


retinoic acid early transcript 1 


1124 


99 


811 


gil3128925 


Homo sapiens 


ULBP2 protein 


1070 


94 


811 


gi21961213 


Homo sapiens 


UL16 binding protein 2 


1070 


94 


812 


gi9280405 


Homo sapiens 


adlican 


1372 


46 


812 


gi3328186 


Caenorhabditis 
elegans 


hemicentin precursor 


475 


29


812


gi 14575679 


Homo sapiens 


hemicentin 


493 


28 


814 


gi9280405 


Homo sapiens 


adlican 


2438 


35 


814 


gi 14575679 


Homo sapiens 


hemicentin 


688 


25 


814 


gi3328186 


Caenorhabditis 
elegans 


hemicentin precursor 


586 


26 


815 


gi21619635 


Homo sapiens 


similar to Alu subfamily SQ sequence 
contamination warning entry 


270 


60 


815 


gi6650810 


Homo sapiens 


PRO1902 


264 


63 


815 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c-NTP 


247 


62 


816 


gi6707435 


Homo sapiens 


apolipoprotein A5 


1864 


100 


816 


gil2240284 


Mus musculus 


apolipoprotein A5 


1310 


72 


816 


gi6707431 


Rattus 
norvegicus 


apolipoprotein A5 


1293 


72 


817 


gi6707435 


Homo sapiens 


apolipoprotein A5 


1864 


100 


817 


gi 12240284 


Mus musculus 


apolipoprotein A5 


1310 


72 


817 


gi6707431 


Rattus 
norvegicus 


apolipoprotein A5 


1293 


72 


818 


gil2751065 


Homo sapiens 


PNAS-25 


360 


81 


818 


gi!208732 


Drosophila 
melanogaster 


ovary2 


276 


33 


818 


gi21428518 


Drosophila 
melanogaster 


LD33046p 


275 


33 


819 


gi577l420 


Homo sapiens 


group IID secretory phospholipase A2 


852 


100 


819 


gi6453793 


Homo sapiens 


phospholipase A2 


846 


99


819 


gil0862736 


Homo sapiens 


dJ169023.3 (phospholipase A2 group 
IID) 


846 


99 


820 


gi60 15448 


Hylobates lar 


dopamine receptor D4 


79 


35 


820 


gi5059331 


Human 

papillomavirus 
type 83 


major capsid protein 


85 


29 


820 


gil3278034 


Mus musculus 


Similar to selectin, platelet (p-selectin) 
ligand 


83 


35 


821 


gi 12654883 


Homo sapiens 


rTS beta protein 


2112 


96 


821 


gi 1150421 


Homo sapiens 


rTSbeta 


2112 


96 


821 


gil 1094019 


Homo sapiens 


RTS beta 


2106 


96 


822 


gi 12803 167 


Homo sapiens 


nucleosome assembly protein 1-like 1


1728 


99 


822 


gil 89067 


Homo sapiens 


NAP 


1728 


99 


822 


gi220496 


Mus musculus 


nucleosome assembly protein-1


1718 


98 


823 


gi 13432042 


Homo sapiens 


integrin-linked kinase-associated 
serine/threonine phosphatase 2C 


2009 


99 


823 


gi20072498 


Mus musculus 


Similar to protein phosphatase 2C 


1926 


94 


823 


gi3777604 


Rattus 
norvegicus 


protein phosphatase 2C 


1922 


94 


824 


gi7768636 


Xenopus laevis 


Kielin 


242 


36 


824 


gi6979313 


Mus musculus 


cysteine-rich repeat-containing protein
CRIM1


183 


30 


824 


gill527817 


Homo sapiens 


CRIM1 protein 


178 


30 


825 


gi21 928259 


Homo sapiens 


seven transmembrane helix receptor 


1023 


100 


825 


gil8480746 


Mus musculus 


olfactory receptor MOR261-10 


864 


84 


825 


gil8480744 


Mus musculus 


olfactory receptor MOR261-9 


858 


82 


826 


gi2 1928655 


Homo sapiens 


seven transmembrane helix receptor 


1458 


93 


826 


gil8480746 


Mus musculus 


olfactory receptor MOR261-10 


1280 


79 


826 


gil 8480744 


Mus musculus 


olfactory receptor MOR261-9 


1258 


78 


827 


gi6760369 


Mus musculus 


ODZ3 


364 


95


827


gi4760780 


Mus musculus 


Ten-m3 


364 


95 


827 


gi5307761 


Danio rerio 


ten-m3 


310 


78 


828 


gi21205852 


Homo sapiens 


T-cell activation Rho GTPase activating
protein; TA-GAP 


3756 


100 


828 


gi21205854 


Homo sapiens 


T-cell activation Rho GTPase activating
protein splice variant 1; TA-GAP 


2850 


100 


828 


gi 16265 93 8 


Homo sapiens 


FKSG15 


2439 


98 


829 


gil0432396 


Homo sapiens 


dJ947L8.1.5 (novel CUB domain protein)


383 


62 


829 


gil4787176 


Mus musculus 


CSMD1 


373 


61 


829 


gi 14787181 


Homo sapiens 


CUB and sushi multiple domains protein 
1 short form 


369 


60 


830 


gil0432396 


Homo sapiens 


dJ947L8.1.5 (novel CUB domain protein) 


383 


62 


830 


gil4787176 


Mus musculus 


CSMD1 


373 


61 


830 


gil4787181 


Homo sapiens 


CUB and sushi multiple domains protein 
1 short form 


369 


60 


831 


gi532124 


Dictyostelium 
discoideum 


myosin IC 


525 


41 


831 


gi6472600 


Chara corallina


unconventional myosin heavy chain 


511 


43 


831 


gi9453839


Chara corallina


myosin 


511 


43 


832 


gi8953751 


Arabidopsis 
thaliana 


myosin heavy chain MYA2 


646 


40 


832 


gi6472600 


Chara corallina


unconventional myosin heavy chain 


646 


39 


832 


gi9453839 


Chara corallina


myosin 


646 


39 


833 


gil7066528 


Canis familiaris 


immunoglobulin gamma heavy chain C 


42 


38 


833 


gi21 113238 


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


IS1595 transposase


50 


43 


833 


gil6413516 


Listeria innocua 


similar to B. subtilis Ylal protein 


56 


37


834 


gi7248845 


Homo sapiens 


testican-1 


2429 


99


834 


gl793845 


Homo sapiens 


testican 


2429 


99 


834 


gi21265163 


Homo sapiens 


sparc/osteonectin, cwcv and kazal-like 
domains proteoglycan (testican) 


2425 


99 


835 


gil2804465 


Homo sapiens 


prostate cancer overexpressed gene 1 


1632 


59 


835 


gi3462515 


Homo sapiens 


PB39 


1632 


59 


835 


gil3111981 


Homo sapiens 


Similar to selectively expressed in 
embryonic epithelia protein-1 


283 


34 


836 


gil2804465 


Homo sapiens 


prostate cancer overexpressed gene 1 


1637 


59 


836 


gi34625l5 


Homo sapiens 


PB39 


1637 


59 


836 


gil3lll98l 


Homo sapiens 


Similar to selectively expressed in 
embryonic epithelia protein-1 


283 


34 


837 


gi7689029 


Homo sapiens 


uncharacterized hypothalamus protein 
HBEX2 


664 


100 


837 


gi 1 7391348 


Homo sapiens 


Similar to brain expressed, X-linked 1 


664 


100


837 


gi9963771 


Homo sapiens 


ovarian granulosa cell 13.0 kDa protein 
hGR74 homolog


664 


100 


838 


gi4585574 


Rattus 
norvegicus 


Slit1


287 


35 


838 


gil7380582 


Homo sapiens 


SLIT1 isoform B 


279 


35 


838 


gi4049587 


Homo sapiens 


Slit-2 protein


297 


35 


839 


gi 15488920 


Homo sapiens 


Similar to RIKEN cDNA 2010107G23 
gene 


632 


100 


839 


gi 19354289 


Mus musculus 


RIKEN cDNA 2010107G23 gene 


570 


92


839 


gi2267416 


Hepatitis D
virus


hepatitis delta antigen


76


33








840 


gi21619776 


Homo sapiens 


Similar to RIKEN cDNA 2600011E07
gene 


2491 


100 


840 


gi20988071 


Mus musculus 


Similar to RIKEN cDNA 2600011E07
gene 


921 


80 


840 


gil453l29l 


Mus musculus 


high mobility group protein isoform I 


on 
Of 


34 


841 


gi2 1 667649 


Drosophila 
melanogaster 


myosin binding subunit of myosin 
phosphatase 


231 


29 


841 


gi2 1392 168 


Drosophila 
melanogaster 


RE63915p 


Ol 1 


29 


841 


gi3 929221 


Homo sapiens 


TRF1-interacting ankyrin-related ADP-
ribose polymerase


163 


DaC 


842 


gi 12408286 


Homo sapiens 


apolipoprotein L-IV splice variant a 


i /4Z 


inn 


842 


gi 13374351 


Homo sapiens 


apolipoprotein L4 


1 70ft 
1 /Zo 


GO 

77 


842 


gi 12408285 


Homo sapiens 


apolipoprotein L-IV splice variant b 


10o3 


QQ 
70 


843 


gil2408286 


Homo sapiens 


apolipoprotein L-IV splice variant a 


1737 


99 


843 


gil3374351 


Homo sapiens 


apolipoprotein L4 


1723


AO 

yy 


843 


gi!2408285 


Homo sapiens 


apolipoprotein L-IV splice variant b 


1678 


98 


844 


gi2 1744725 


Homo sapiens 


glycosyl-phosphatidyl-inositol-MAM


2296 


100


844 


gi7529598 


Homo sapiens 


dJ402N21.3 (novel protein with 
Immunoglobulin domains) 


1048 


100 


844 


gi7529599 


Homo sapiens 


dJ402N2U (novel protein) 


662 


100 


845 


gi21 744725 


Homo sapiens 


glycosyl-phosphatidyl-inositol-MAM 


5051 


100 


845 


gi7529598 


Homo sapiens 


dJ402N21.3 (novel protein with 
Immunoglobulin domains) 


1548 


99 


845 


gi7529597 


Homo sapiens 


dJ402N21.2 (novel protein with MAM 
domain) 


1474 


100 


846 


gi4007758 


Schizosaccharo 
myces pombe 


conserved protein; similar to S. cerevisiae 
YPR144C 


633 


34 


846 


gil066493 


Saccharomyces 
cerevisiae 


Weak similarity near C-terminus to RNA
Polymerase beta subunit (Swiss Prot.
accession number P11213) and CCAAT-
binding transcription factor (PIR
accession number A36368)


482 


32 


846 


gil8086412 


Arabidopsis 
thaliana


At2g17230/T23A1.11


42U 


AA 
44 


847 


gil4701768 


Homo sapiens 


Vam6/Vps39-like protein


1AQQ 


yo 


847 


gil4280050 


Homo sapiens 


Vps39/Vam6-like protein 


3499 


96 


847 


gil8857927 


Mus musculus 


VPS39 long isoform 


1 A AO 


yi 


848 


gi3811347 


Homo sapiens 


cytosolic phospholipase A2 beta 


1209 


44 


848 


gi4886978 


Homo sapiens 


cytosolic phospholipase A2 beta; 
cPLA2beta 


1209 


44 


848 


gil90004 


Homo sapiens 


phosphatidylcholine 2-acylhydrolase 


512 


35 


849 


gi7291437 


Drosophila 
melanogaster 


CG4071-PA 


516 


51 


849 


gil7946619 


Drosophila 
melanogaster 


RH31535p 


217 


42 


849 


gi21645615 


Drosophila 
melanogaster 


CG4071-PB 


217 


42 


850 


gil316l409 


Mus musculus 


family 4 cytochrome P450 


444 


73 


850 


gi5263306 


Coptotermes 
acinaciformis 


family 4 cytochrome P450 


200 


41 


850 


gil3182964 


Mus musculus 


cytochrome P450 CYP4F13 


196 


38 


851 


gii3447749 


Homo sapiens 


fibroblast growth factor receptor 5 


2475 


98 


851 


gi 10944887 


Homo sapiens 


FGFR-like protein 


2475 


98


851






FGF homologous factor receptor


9471 


07 


852 


gil3447749 


Homo sapiens 


fibroblast growth factor receptor 5 


2701 


99 


RS9 

O J J* 


gllUi/*f*rOO / 


Homo sapiens


FGFR-like protein


Z/Ul 


yy 


RS9 

OJ£ 


oi1"*1 R^61 8 
glU lOjulo 


Homo sapiens


FGF homologous factor receptor


Z04/ 


yo 


853 


©13183618 


Homo sapiens 


FGF homologous factor receptor 


583 


98 


853 


gil3447749 


Homo sapiens 


fibroblast growth factor receptor 5 


583 


98 


OJ5 


gliUy44oo7 


Homo sapiens 


FGFR-like protein 


583 


98 




gio4Jo5o 


Rattus 
norvegicus 


synaptotagmin VII 


2035 


95 


oj4 


guZoo/44o 


Rattus 
norvegicus 


synaptotagmin VIIs 


2035 


95 


RSA 

0J4 


glOI jO/oO 


Mus musculus 


• — Trff 

synaptotagmin VII 


zUzo 


95 


OJ-> 


guzuoj /uy 


Homo sapiens 


a disintegrin-like and metalloprotease 
(reprolysin type) with thrombospondin 
type 1 motif, 12


oo4z 


1 r\(\ 
100 


OJJ 


rriSQ9'*7RR 
gu?yzj / OO 


Homo sapiens


zinc metauoprotease aljajvl l o / 


OAQQ 

Z4©y 


Do 


OJ~> 


oi10171 178 

gLl7l / 1 J. / O 


Homo sapiens


meiaiiuproiease oismiegnn 10 wnn 

lliiUIuUObpUilUlll type 1 II1UL1L 


1 SQR 


^o 






Homo sapiens


Similar to TLH29 protein precursor


1 ss 


RA 
50 


856 






nTFT97-1ilff» nrnt*»in 


OJ 




856 


gil 1493982 


Homo sapiens 


TLH29 protein precursor 


83 


44 


857 


<ri 1^*549 8 74 


Mus musculus


Similar trt f^r?-T»/^7 rtrrttein 

oiinudr io Lui-o / protein 


19QQ 




857 


gi21707079 


Homo sapiens 


similar to RIKEN cDNA 2210412D01 


1278 


75 


S57 


<ri4Q9Q6fn 


Homo sapiens


v^vji-o/ proiein 


ins7 

lUo / 


R1 
0 i 


O J o 


gl IjJHiO /*T 


Mus musculus


oimuar to v^vji-o / protein 


1700 


/** 


858 


gi21707079 


Homo sapiens 


similar to RIKEN cDNA 2210412D01 


1279 


73 


RSR 


gw-yzyoiD 


Homo sapiens 


coi-o/ protein 


lUo/ 


ol 


859 


gi21595166 


Mus musculus 


RIKEN cDNA 4933425F03 gene 


1823 


83 


ojy 




Mus musculus 


Similar to RIKEN cDNA 4933425F03 
gene 


1822 


83 


RSQ 


m *01 /CI QRRR 

glZlOlVooo 


— : 

Homo sapiens 


Similar to RIKEN cDNA 4933425F03
gene 


134Z 


y© 




<ri91 SOS 166 

£1Z, I J7J lOO 


Mus musculus


RIKEN cDNA 4933425F03 gene


777Q 
ZZ/o 


SR 
OO 


860 


gil6359267 


Mus musculus 


Similar to RIKEN cDNA 4933425F03 
gene 


2277 


88 


860 


gi21619888 


Homo sapiens 


Similar to RIKEN cDNA 4933425F03 
gene 


1958 


99 


861 

OV 1 


cri 1 1 40146^ 


Homo sapiens


rivwzojz 


JUL 


7S 


R61 


oil/11 RQQAfi 


Homo sapiens 


r K^JU / 04 


771 
Z / l 




861 


«n*9 1 1 

glZl lV*HtO*t 


Homo sapiens




Z04 


7A 
/U 


863 


gi21320872 


Mus musculus 


Cog8 


2747 


88 


RA^ 


gi i /oozyoo 


Drosophila 
melanogaster 




795 


45 


863 


gi5922593 


Schizosaccharo 
myces pombe 


pi008 


230 


21 


864 


et21618851 


Mus musculus


RTKFNrDNA 9610S10T01 crpnf» 


RR7 
ooz, 


07 


864 


gi20977573 


Danio rerio 


U1 small nuclear ribonucleoprotein C


75 


32 


864 


gil562574 


Mus musculus 


U1 snRNP-specific protein C


75 


32 


865 


gil7862312 


Drosophila 
melanogaster 


LD21841p 


646 


41 


865 


gi22294210 


Thermosynechococcus
elongatus BP-1


WD-40 repeat protein 


123 


27 


865 


gi886024 


Thermomonospora
curvata


PkwA


124


25








866 


gi3878846 


Caenorhabditis 
elegans 


R05D7.3 


119 


37 


866 


gi!685056 


Xenopus laevis 


Pax6 


87 


24 


866 


gi8132389 


Xenopus laevis 


paired domain transcription factor variant 
A 


81 


23 


867 


gil2406973 


Homo sapiens 


alanine-glyoxylate aminotransferase 2 


2740 


100 


867 


gil944136 


Rattus 
norvegicus 


beta-alanine-pyruvate aminotransferase 


2255 


83 


867 


gil000448 


Rattus 
norvegicus 


Rat kidney AGT2 precursor 


2208 


81 


868 


gi!2406973 


Homo sapiens 


alanine-glyoxylate aminotransferase 2 


1870 


98 


868 


gil944136 


Rattus 
norvegicus 


beta-alanine-pyruvate aminotransferase 


1630 


86 


868 


gil000448 


Rattus 
norvegicus 


Rat kidney AGT2 precursor 


1583 


84 


869 


gi4165315 


Sus scrofa 


kallikrein 


468 


42 


869 


gil90263 


Homo sapiens 


plasma prekallikrein 


467 


38 


869 


gi8809781 


Homo sapiens 


plasma kallikrein precursor 


467 


38 


870 


gi 17985046 


Brucella 
melitensis 


GLYCOSYL TRANSFERASE


137 


28 


870 


gi5478237 


Brucella 
melitensis 


Bme7 


137 


28 


870 


gi20906785 


Methanosarcina 
mazei Goe1


Transposase 


126 


25 


871 


gi4565840 


Cnemidophorus 
tigris 


cytochrome b oxidase 


76 


41 


871 


gil5023030 


Clostridium 
acetobutylicum 


Uncharacterized membrane protein, 
ortholog YYAS B.subtilis 


72 


44 


871 


gi7549241 


Barbatia tenera 


cytochrome oxidase subunit I 


71 


28 


872 


gi8705222 


Homo sapiens 


IL-17B receptor 


1998 


100 


872 


gi9246433 


Homo sapiens 


IL-17 receptor homolog precursor 


1996 


99 


872 


gi9246429 


Mus musculus 


IL-17 receptor homolog precursor 


1504 


75 


873 


gil8676472 


Homo sapiens 


FLJ00133 protein 


6475 


100 


873 


gil 8676498 


Homo sapiens 


FLJ00146 protein 


2352 


100 


873 


gil61467 


Strongylocentrotus
purpuratus


fibropellin Ia


1246 


38 


874 


gi213198 


Petromyzon 
marinus 


fibrinogen alpha chain 


89 


39 


874 


gil52923l7 


Drosophila 
melanogaster 


LD46863p 


87 


34 


874 


gi4877921 


Streptococcus 
pyogenes 


serum opacity factor precursor 


81 


33 


875 


gil4249936 


Homo sapiens 


Similar to S-adenosylhomocysteine 
hydrolase-like 1 


2582 


97 


875 


gil 7390493 


Mus musculus 


S-adenosylhomocysteine hydrolase-like 1 


2429 


92 


875 


gi2852125 


Homo sapiens 


S-adenosyl homocysteine hydrolase 
homolog 


2429 


92 


876 


gil4279990 


Homo sapiens 


ubiquitin UBF-fl 


458 


100 


876 


gi6706799 


Homo sapiens 


dJ447F3.2.1 (ubiquitin-conjugating 
enzyme E2 H10 (isoform 1)) 


214 


74 


876 


gil4043322 


Homo sapiens 


ubiquitin carrier protein E2-C 


214 


74 


877 


gi20086516 


Homo sapiens 


prominin-related protein 


4241 


99 


877 


gi20086520 


Mus musculus 


prominin-related protein 


3157 


73


877


gil9909067 


Rattus 
norvegicus 


testosterone-regulated prominin-related 
protein 


2920 


69 


O /o 


gnJijy4ov 


Homo sapiens 


Translation may initiate at the ATG 
codon at nucleotides 40-42 or the ATG at
nucleotides 43-45 


Z1U4 






glzl4oJ o4u 


Sus scrofa 


fibrinogen-like protein 2 


AAA 
4U0 


JO 


878 


gi9229906 


Ciona 
intestinalis 


fibrinogen-like protein 


408 


36 


879 


gil 3 159480 


Homo sapiens 


Translation may initiate at the ATG
codon at nucleotides 40-42 or the ATG at 
nucleotides 43-45 


2100 


99 


o /y 


glzl4ojo40 


Sus scrofa 


fibrinogen-like protein 2 


4U0 


jO 


o/y 


giyzzyyuo 


Ciona 
intestinalis


fibrinogen-like protein 


AOS 
4Uo 


OO 


880 


gil3 159480 


Homo sapiens 


Translation may initiate at the ATG 
codon at nucleotides 40-42 or the ATG at
nucleotides 43-45 


2100 


99 


ssn 

ooU 


m91 AS1SA£ 
glZ 1 *to J 0*tO 


Sus scrofa


fibrinogen-like protein 2


*tUO 


JO 


ssn 

ooU 


giyzzyyuo 


Ciona 
intestinalis 


fibrinogen-like protein


HliO 


JO 


SSI 
oo 1 


gll I'H'J'tOJ 


Homo sapiens


xi\\/ZJ JO 




oo 


SSI 
Oo 1 


m 77701 1Q 
gl / / /ul jy 


Homo sapiens


PRH1799 
ris.u i /zz 


^1 R 

D lo 


60 
oy 


881 


gil 872200 


Homo sapiens 


alternatively spliced product using exon 

11A 

1 J/\ 


304 


72 


8S9 

OOZ 


oi 10175777 
gl 1U I / J / / / 


Odd 11 US 

hsiloni i ratio 
IldiUUUldllo 


proicdsc t> pec Hit/ lOi pildge IdlllUUd vil 
iCpivoSOl 


o / 


J*T • 


SS9 




Ypn orM i c 1 op\n c 


Tnh 
1 uu 


64 


J 1 


SS? 

OOZ 


ai71QQRS^S * 


DofttlQ 
IvaLLUj 

VvglvUu 


mrtnrmurl^rtvx/lQtP 1 tn3ncr»rtT*t'f»'r S 
lliviilvval UUAjfiaic ucuiopiJiid o 


67 


JJ 


883 


ail 8073362 


Homo sapiens


cystine/glutamate transporter


2552 


100 


883 


eil 1493652 


Homo sapiens


calcium channel blocker resistance
protein CCBR1


2552 


100 


883 


ei 13924720 


Homo sapiens 


cystine/glutamate transporter xCT


2552 


100 


884 


gi507213 


Homo sapiens 


serine kinase 


1797 


97 


884 


eil4252988 


Homo sapiens


SRPK1a protein kinase


1797 


97 


884 


ei3 135975 


Homo sapiens


dJ422H11.1.1 (Serine Kinase) (isoform 1)


1796 


98


885 


ri9837288 


Homo sapiens


C-type lectin


271 


54 


885 


gi6651065 


Homo sapiens 


lectin-like NK cell receptor LLT1 


271 


54 


sss 


ail 80441 5 R 


Homo sapiens


Similar to lectin-like NK cell receptor


270 


57 


886 


gi22164066 


Homo sapiens 


neuroblastoma-amplified protein


7571 


99 


886 


gi5833317 


Oryzias latipes 


mixed lineage leukemia-like protein 


89 


23 


QS£ 
OOU 


m71 AS71 7 
gl / lUo/ I / 


Nicotiana 
tabacum 


MAR-binding protein MFP1 homolog


so 


J 1 


SC7 
oo / 


rri771 £AO££ 

glZZ 1 04U00 


Homo sapiens 


neuroblastoma-amplified protein


05V / 


OS 

yo 


OO / 


gOoJJJ I / 


Oryzias latipes 


mixed lineage leukemia-like protein


oy 


ZJ 


888 


gil7430957 


Ralstonia
solanacearum


HYPOTHETICAL TRANSMEMBRANE 
PROTEIN


453 


40 


888 


gi!3421965 


Caulobacter
crescentus
CB15


M20/M25/M40 family peptidase 


377 


38 


888 


gi2330791 


Schizosaccharo 
myces pombe 


carboxypeptidase s precursor 


352 


33 


889 


gi11558029


Homo sapiens 


organic cation transporter 


1860 


99 


889 


gi18088251


Homo sapiens 


Similar to hBOIT for potent brain type 
organic ion transporter 


1206 


97


889


gi9663117


Homo sapiens


organic cation transporter 


1852 


99 


890 


gi344112 


synthetic 
construct 


chloramphenicol acetyltransferase and 
carboxy terminal fusion protein 


57 


28 


890 


gi412284


synthetic
construct


carboxy terminal fusion protein


57 


28 


890 


gi13122523


Barbus
brachycephalus


ATP synthase 8 


56 


28 


891 


gi13375149


Homo sapiens


dJ1118M15.2 (Novel protein)


538 


98 


891 


gi7259265 


Mus musculus 


contains transmembrane (TM) region 


269 


48 


891 


gi1806278


Rattus
norvegicus


glycoprotein 56


143 


35 




gi16589003


Homo sapiens


bromodomain-containing 4


6353 


99 


892 


gi9931486


Mus musculus 


cell proliferation related protein CAP 


5635 


90 


892 


gi18308125


Mus musculus 


bromodomain-containing protein BRD4
long variant 


5633 


90 


893 


gi15420828


Homo sapiens


NOE3-1 


2504 


99 


893 


gi19386926


Rattus 
norvegicus


optimedin form B 


2484 


98 


893 


gi19386930


Mus musculus 


optimedin form B


2484 


98 


894 


gi10336599


Xenopus laevis 


follistatin-related protein 


234 


32 




<ri!4900fi 


Mus musculus


TGF-beta-inducible protein


225 


29 


894 


gi20810011


Mus musculus


follistatin-like


223 


29 


895 


gi5002565 


Takifugu
rubripes


cysteine conjugate beta-lyase 


1244 


55 


895 


gi758591


Homo sapiens


glutamine--phenylpyruvate
aminotransferase 


1201 


51 


895 


gi15425868


Aedes aegypti


kynurenine aminotransferase


1188 


55 


896 


gi20522012 


Homo sapiens 


similar to an actin bundling protein, 
dematin


1312 


57 


896 


gi2337952 


Homo sapiens 


actin-binding double-zinc-finger protein 


1312 


57 


896 


<ri9 1666411 




actin-binding LIM protein 1 medium
isoform


1305 


57 


898 


gi6716518


Mus musculus 


doublecortin-like kinase


821 


52 


898


m9161Q209 
giz loiyzuz 


Homo sapiens


Similar to doublecortin and CaM kinase-
like 1


810 


51 






Drosophila
melanogaster




778


45 


899 


gi9280108 


Macaca
fascicularis


membrane-associated prostaglandin E
synthase-2


1907 


97 


899 


gi9757960 


Arabidopsis
thaliana


contains similarity to glutathione-S- 

u ai lb ici ooc/ giumrv/UvJA.1 1 i* w gcnc lu.ivwv^^u. 

26 


396 


50 




gi17944598


melanogaster 


tf H17/>14n 


566 


42 




oi48Q4R54 


Homo sapiens


complement C1q A chain precursor


1108 


99




cri909RR805 


Homo sapiens


complement component 1, q
subcomponent, alpha polypeptide 


1108 


99


900 


gi12805247


Mus musculus 


complement component 1, q 
subcomponent, alpha polypeptide 


945 


70 


901 


gi10176989


Arabidopsis 
thaliana 


contains similarity to hedgehog- 
interacting protein~gene_id:MYH19.17 


86 


34 


901 


gi456384 


Blastocrithidia 
culicis 


apocytochrome B 


41 


50 


902 


gi2565046 


Homo sapiens 


CAGF28 


3775 


97 


902 


gi21707458 


Homo sapiens 


PAX transcription activation domain 


2709 


87 



WO 2004/080148 



PCT/US2003/030720



158 
TABLE 2 A 



SEQ 
ID 


Hit ID 


Species 


Description 


S 

score 


Percentage 
identity 








... 

interacting protein 1 like 






902 


gi4336734 


Mus musculus 


Pax transcription activation domain 
interacting protein PTIP


2473 


80


903 


gi4336734 


Mus musculus 


Pax transcription activation domain 
interacting protein PTIP 


531 


93 


903 


gi14164561


Xenopus laevis 


Swift 


467 


79 


903 


gi12382298


Human 
herpesvirus 8 


OrfK10


48 


34 


904 


gi19353375


Mus musculus 


RIKEN cDNA 1110031I02 gene


745 


78 


904 


gi15929776


Homo sapiens 


growth suppressor 1 


137 


41 


904 


gi5805194 


Rattus 
norvegicus 


leprecan 


137 


41 


905 


gi2443352 


Mus musculus 


platelet glycoprotein Ib beta


150 


45 


905 


gi21355064


Homo sapiens 


platelet glycoprotein Ib beta chain


146 


43 


905 


gi306792 


Homo sapiens 


platelet glycoprotein Ib beta chain
precursor 


146 


43 


906 


gi13991166


Homo sapiens 


sialic acid-binding immunoglobulin-like 
lectin-like short splice variant 


1174 


100


906 


gi13991167


Homo sapiens 


sialic acid-binding immunoglobulin-like 
lectin-like long splice variant 


1174 


100 


906 


gi14625822


Homo sapiens 


Siglec-Lt 


1174 


100 


907 


gi21708018 


Mus musculus 


RIKEN cDNA 2700029E10 gene 


626 


66 


907 


gi7547035 


Homo sapiens 


SGC32445 protein 


474 


63 


907 


gi21626575 


Drosophila 
melanogaster 


CG30193-PA 


457 


55 


908 


gi6273399 


Homo sapiens 


melanoma-associated antigen MG50 


2748 


60 


908 


gi1504040


Homo sapiens 


similar to D.melanogaster 
peroxidasin(U11052)


2748 


60 


908 


gi531385 


Drosophila 
melanogaster 


peroxidasin precursor 


1721 


42 


909 


gi6273399 


Homo sapiens 


melanoma-associated antigen MG50 


2748 


60 


909 


gi1504040


Homo sapiens 


similar to D.melanogaster 
peroxidasin(U11052)


2748 


60 


909 


gi531385 


Drosophila 
melanogaster 


peroxidasin precursor 


1721


42


910 


gi6273399 


Homo sapiens 


melanoma-associated antigen MG50 


llyy 


^0 

jy 


910 


gi1504040


Homo sapiens 


similar to D.melanogaster 
peroxidasin(U11052)


77QQ 

Zfyy 


jy i 


910 


gi531385


Drosophila 
melanogaster 


peroxidasin precursor 


T7AR 
I /Uo 


*rl 


911


gllo lozjzi 


Mus musculus 


crumbs-like protein 1 precursor 


111 
III 


D L 


911 


gi6014482


Homo sapiens 


CRB1 


754 


30 


911 


gllOWDZOy 


Homo sapiens 


CRB1 isoform I precursor 


754


30


912 


gi6650802 


Homo sapiens 


PRO1848


205 


56 


912 


gi21104464


Homo sapiens 


OK/SW-CL.41 


ion 

loo 


ol 




■11/1 Q1 A /?1 


Homo sapiens 


rKAJZoOZ 


17^ 
1 fj 


^4 


913 


gi6808611 


Homo sapiens 


88-kDa Golgi protein 


3237 


99 


913 


gi6969980 


Homo sapiens 


golgin 67 


2345 


98 


913 


gi7211438


Homo sapiens 


golgin-67 


2330 


98 


914 


gi307377 


Homo sapiens 


cAMP-dependent protein kinase RI-beta
regulatory subunit 


1957 


99 


914 


gi200365 


Mus musculus 


cAMP-dependent protein kinase 
regulatory subunit 


1886 


94 


914 


gi15030299


Mus musculus 


Similar to protein kinase, cAMP
dependent regulatory, type I beta


1881


94






915 


gi20306468 


Mus musculus 


Similar to RIKEN cDNA 2610025P08 
gene 


382 


41 




gi7161798


Homo sapiens 


dJ470B24.1.1 (myeloid/lymphoid or 
mixed-lineage leukemia (trithorax 
(Drosophila) homolog); translocated to, 4 
(AF-6) (isoform 1))


130 


32 


915


gi7161797


Homo sapiens


dJ470B24.1.2 (myeloid/lymphoid or
mixed-lineage leukemia (trithorax
(Drosophila) homolog); translocated to, 4
(AF-6) (isoform 2))


1 1CS 


52. 


916 


gi1845577


Mus musculus


arachidonate 12(S)-lipoxygenase


2633 


11 


916 


gi3645913 


Mus musculus 


12(S)-lipoxygenase 


2633 


11 


916 


gi15489302


Mus musculus


Similar to arachidonate 15-lipoxygenase




11 


917 


gi15489302


Mus musculus 


Similar to arachidonate 15-lipoxygenase 


751 


78 


917


gi1845577


Mus musculus


arachidonate 12(S)-lipoxygenase


748


78 


917


gi1101886


Mus musculus


arachidonate lipoxygenase


748


78


918 


gi15489302


Mus musculus 


Similar to arachidonate 15-lipoxygenase 


1266 


75 


918


gi1845577


Mus musculus 


arachidonate 12(S)-lipoxygenase


17£7 


75


918 


gi1101886


Mus musculus 


arachidonate lipoxygenase 


1263 


75 




giioooiyo4 


Leishmania
major




IOC 

lUo 


71 


919


gl 1 / 1 jjOJy 


Nostoc sp. PCC
7120


WD-repeat protein


0^ 


71 
Zl 


919 


gi11139242


Homo sapiens 


meiotic recombination protein REC14 


93 


25 


920


ai 1 7R677Q8 


Drosophila
melanogaster




677 
OZ/ 


47 
HZ 


920 


oi24251 1 1 


Dictyostelium
discoideum


7inA 


122 


28 


920 


at 641 058 


Homo sapiens


non-muscle myosin B


1 1 o 


24 


921


gi8132683


Homo sapiens


cytokine-like protein C17


741 


64 


921


gi12751073


Homo sapiens


PNAS-31


74


92


921 


gi11323101


Saint Croix
River virus


VP4 


79 


32 


922 


gi8132683


Homo sapiens


cytokine-like protein C17


241 


64 


922 


gi12751073


Homo sapiens


PNAS-31


74 


92 


922 




Saint Croix
River virus


VP4


79 


32 


923 


gi8132683


Homo sapiens


cytokine-like protein C17


384 


73 


923 


gi12751073


Homo sapiens 


PNAS-31 


74 


92 


923 


gi216168 


Bacteriophage 

Off 1 


promoter 3 protein 


56 


37 


924 


gi8132683 


Homo sapiens 


cytokine-like protein C17 


263 


98 


924


mil A1067 

gll IHOUO/ 


Canis familiaris


alpha-L-fucosidase 


60 
Qy 




924 


gi309444 


Mus musculus 


MRK 


58 


65 


925 


gi8132683 


Homo sapiens 


cytokine-like protein C17 


591 


100 


925


m 14068 10 


Mus musculus


growth factor receptor 




OU 1 


925 


gi12724591


Lactococcus 
lactis subsp. 
lactis 


UNKNOWN PROTEIN 


41 


37 


926 


gi17975777


Homo sapiens 


vesicular inhibitory amino acid 
transporter 


2741 


99 


926 


gi13396317


Homo sapiens 


bA12201.1 (A novel protein (ortholog of 
the mouse vesicular inhibitory amino acid 
transporter, VIAAT)) 


2741 


99


926


gi2587061 


Rattus 
norvegicus 


vesicular GABA transporter 


2694 


98 


927 


gi3097285 


Rattus 
norvegicus


ZOG 


670 


39 


927 


gi802014 


Rattus 
norvegicus


preadipocyte factor 1 


665 


39 


927 


gi13365691


Mus musculus


dlk (Delta like)


o*fy 


**o 


928 


gi6624073 


Homo sapiens 


similar to hepatitis delta antigen
interacting protein A; similar to
AAB05928.1 (PID:g1488314)


1757 


93 


928 


gi1488314


Homo sapiens 


hepatitis delta antigen interacting protein 
A 


274 


45 


928 


gi16768374


Drosophila 
melanogaster


GM03282p 


359 


37 


929 


gi4337106 


Homo sapiens 


BAT4 


864 


98 


929 


gi14250638


Homo sapiens


Similar to DNA segment, Chr 17, human
D6S54E




GO 

yo 


929 


gi3941733 


Mus musculus 


BAT4 


JO 1 


71 


930 


gi9759107 


Arabidopsis 
thaliana 


phosphate/phosphoenolpyruvate 
translocator protein-like


289 


30 


930 


gi21536504


Arabidopsis 
thaliana 


phosphate/phosphoenolpyruvate
translocator-like protein


OA's 


97 


930 


gi8778643 


Arabidopsis 
thaliana 


F5O11.25


935 


90 


931 


gi5852981 


Homo sapiens 


cardiotrophin-like cytokine CLC 


1204 


99 


931 


gi6007641 


Homo sapiens


neurotrophin-1/B-cell stimulating factor-3


1204




931 


gi15277895


Homo sapiens 


Similar to cardiotrophin-like cytokine; 

neurotrophin-1/B-cell stimulating factor-3


1204 


99 


932 


gi22003732 


Homo sapiens 


MTLC 


853 


99 


932 


gi18490933


Homo sapiens


Similar to RTKFN rHNA 1 1 1 0000VKOA 
gene 


o**o 




932 


gi20453974 


Mus musculus 


MT-MC1 


71 R 


519 
oz 


933 


gi9958075 


Arabidopsis
thaliana 


Putative methionine aminopeptidase


739


53 


933 


gi11320956


Arabidopsis 
thaliana 


methionine aminopeptidase-like protein 


739 


53 


933 


gi21553973 


Arabidopsis 
thaliana 


methionyl aminopeptidase-like protein 


717 


52 


934 


gi4104963 


Rattus 
norvegicus 


neurexophilin 4 


1493 


90 


934 


gi1336013


Mus musculus 


neurexophilin 2 


327 


65 


934 


gi4105164


Homo sapiens


neurexophilin 2




OD 


935 


gi15025812


Clostridium
acetobutylicum


Methyl-accepting chemotaxis protein
with HAMP domain


OJ 


JO 


935 


gi17224936


Trypanosoma 
brucei 



corset-associated protein 15 


63 


31 


935 


gi15025892


Clostridium 
acetobutylicum 


Ribosome-associated protein Y (PSrp-1) 


48 


38 


936 


gi16197625


Arabidopsis 
thaliana 


anaphase promoting complex subunit 11


64 


32 


936 


gi10834682


Homo sapiens 


PP3958 


74 


46 


937 


gi19387136


Homo sapiens 


PYRIN-containing APAF1-like protein 5


874 


99 


937 


gi202806 


Rattus 
norvegicus 


vasopressin receptor 


561 


68 


937 


gi21410402 


Mus musculus 


expressed sequence AI504961 


532 


67


938


gi11321325


Homo sapiens 


Lin-7b 


1030 


100 




•oni oil no 

gi2038H93 


Homo sapiens 


Lin-7b protein; likely ortholog of mouse 
LIN-7B; mammalian LIN-7 protein 2 


1030 


100 


938 


gi3885828 


Rattus 
norvegicus 


lin-7-A 


1019 


98 


939 


gi14349125


Homo sapiens 


alpha2-glucosyltransferase 


738 


96 


939 


gi3513451


Rattus 
norvegicus 


potassium channel regulator I 


718 


93 


939 


gi2!7H799 


Drosophila 
melanogaster 


RH44301p


142 


32 


940


gi12803183


Homo sapiens 


polypyrimidine tract binding protein 
(heterogeneous nuclear ribonucleoprotein 
I) 


1527 


91 




gUZ.504 


Homo sapiens 


nuclear ribonucleoprotein 


1527 


91 


940


glJ j / /Z 


Homo sapiens 


polypirimidine tract binding protein 


1527 


91 


941


gi6752658


Homo sapiens


epidermal growth factor repeat containing 
protein 


3046 


99 


941


gi16040981


Mus musculus


POEM


884


51 




glO43UZ40 


Mus musculus 


nephronectin short isoform 


884


51 


942 


gi6752658 


Homo sapiens 


epidermal growth factor repeat containing 
protein 


3036 


98 


942 


gi16040981


Mus musculus 


POEM 


884 


51 


>4z 


gUD4JUz4o 


Mus musculus 


nephronectin short isoform 


884 


51 




gii /youyoy 


Homo sapiens 


sel4-3r protein 


5146


99 


943 


gi11385648


Homo sapiens 


CTCL tumor antigen sel4-3 


3867 


99 




gi/youzio 


Homo sapiens 


RACK-like protein PRKCBP1


3124 


99 


944 


gi17980969


Homo sapiens 


sel4-3r protein 


3140 


99 


944 


gi13677201


Homo sapiens 


dJ569M23.1.2 (protein kinase C binding 
protein 1, isoform 2) 


2771 


100 


944 


gi13677198


Homo sapiens 


dJ569M23.1.3 (protein kinase C binding 
protein 1, isoform 3 (DKFZp564P1772)) 


2638 


96 


945 


gi17980969


Homo sapiens 


sel4-3r protein


3550 


84 


945 


gi13677201


Homo sapiens 


dJ569M23.1.2 (protein kinase C binding 
protein 1, isoform 2) 


2771


100 


945 


gi13677198


Homo sapiens 


dJ569M23.1.3 (protein kinase C binding
protein 1, isoform 3 (DKFZp564P1772))


2638 


96 


946 


gi17980969


Homo sapiens 


sel4-3r protein 


3550 


84 


946


gi13677198


Homo sapiens 


dJ569M23.1.3 (protein kinase C binding
protein 1, isoform 3 (DKFZp564P1772)) 


2380 


90 


946 


gi13677201


Homo sapiens 


dJ569M23.1.2 (protein kinase C binding 
protein 1, isoform 2)


2377 


90 


947 


gi14043211


Homo sapiens 


Similar to RIKEN cDNA 4931428F04 
gene 


2410 


98 


947 


gi22204070 


Macaca mulatta 


metabotropic glutamate receptor 1 


91 


42 


947 


gi170454


Lycopersicon 
esculentum 


cell wall hydroxyproline-rich 
glycoprotein 


70 


39 


948


gii4y /z/jj 


Streptococcus 

pneumoniae 

TIGR4 


alcohol dehydrogenase, zinc-containing 


51 


33 


948 


gi20152351 


Avian 
infectious 
bronchitis virus 


spike glycoprotein S1 subunit


68 


34 


948 


gi9658106 


Vibrio cholerae 


polyhydroxyalkanoic acid synthase 


67 


26 


949 


gi19387136


Homo sapiens 


PYRIN-containing APAF1-like protein 5


1738 


99 


949 


gi202806 


Rattus
norvegicus


vasopressin receptor


1037


64








949 


gi21410402


Mus musculus 


expressed sequence AI504961 


988 


63 


950 


gi3978472 


Rattus 
norvegicus 


potassium channel subunit 


5393 


88 


950 


gi20338417 


Gallus gallus 


potassium channel subunit 


4792 


88 


950 


gi7303760 


Drosophila 
melanogaster


CG12904-PA 


981 


62 


951 


gi18147612


Homo sapiens 


metalloprotease disintegrin 


3535 


99 


951 


gi21908028 


Homo sapiens 


a disintegrin and metalloprotease domain 
33 


3535 


99 


951 


gi13157560


Homo sapiens 


dJ964F7.1 (novel disintegrin and 
reprolysin metalloproteinase family 
protein) 


3078 


99 


952 


gi18606367


Mus musculus 


RIKEN cDNA 4930570C03 gene 


715 


92 


952 


gi9971130 


Schizosaccharo 
myces pombe 


human downs syndrome critical region- 
like 


72 


31 


952 


gi5708224 


Rhodoblastus 
acidophilus 


LH2alpha5 


60 


31 


953 


gi15420879


Mus musculus 


ankyrin repeat-containing SOCS box 
protein 10 


2053 


82 


953 


gi18092200


Homo sapiens 


ASB-10 


1909 


98 


953 


gi18031949


Mus musculus 


SOCS box protein ASB-18 


816 


45 


954 


gi491284 


synthetic 
construct 


IFN-pseudo-omega 2 


799 


98 


954 


gi386800 


Homo sapiens 


interferon-alpha 


330 


72 


954 


gi490110 


Homo sapiens 


interferon-omega 1 


330 


72 


955 


gi9844580 


Homo sapiens 


dJ1153D9.4 (novel protein)


623 


84 


955 


gi9844579 


Homo sapiens 


dJ1153D9.3 (novel protein)


450 


97 


955 


gi15928971


Homo sapiens 


Similar to neuronal thread protein 


430 


90 


956 


gi12804321


Homo sapiens 


peroxisomal short-chain alcohol 
dehydrogenase 


685 


100 


956 


gil9H3668 


Homo sapiens 


NADP-dependent retinol dehydrogenase 
short isoform 


878 


100 


956 


gi11559412


Homo sapiens 


NADPH-dependent retinol 
dehydrogenase/reductase 


587 


100 


957 


gi12718818


Mus musculus 


sulfhydryl oxidase 


496 


49 


957 


gi12718820


Rattus 
norvegicus 


sulfhydryl oxidase 


489 


47 


957 


gi12483919


Rattus 
norvegicus 


FAD-dependent sulfhydryl oxidase-2 


489 


47 


958 


gi12958660


Homo sapiens 


acid phosphatase 


2252 


100 


958 


gi12958663


Homo sapiens 


acid phosphatase variant 3 


1285 


99 


958 


gi52871 


Mus musculus 


lysosomal acid phosphatase 


837 


45 


959 


gi28966 


Homo sapiens 


alpha 1-antitrypsin 


1703 


100 


959 


gi6855601 


Homo sapiens 


PRO0684 


1703 


100 


959 


gi11493443


Homo sapiens 


PRO2209 


1703 


100 


960 


gi28966 


Homo sapiens 


alpha 1-antitrypsin 


1080 


100 


960 


gi11493443


Homo sapiens 


PRO2209 


1080 


100 


960 


gi177829


Homo sapiens 


alpha-1-antitrypsin


1080 


100 


961 


gi28966 


Homo sapiens 


alpha 1-antitrypsin 


1239 


100 


961 


gi11493443


Homo sapiens 


PRO2209 


1239 


100 


961 


gi177829


Homo sapiens 


alpha-1-antitrypsin


1239 


100 


962 


gi28966 


Homo sapiens 


alpha 1-antitrypsin 


1574 


93 


962 


gi11493443


Homo sapiens 


PRO2209 


1574 


93


962


gi177829


Homo sapiens 


alpha-1-antitrypsin


1574 


93 


963 


gi6706993 


Streptomyces 

coelicolor 

A3(2) 


methyltransferase 


83 


26 


963


gi7303904


Drosophila 
melanogaster 


CG13954-PA 


85 


53 


964 


gi2632092 


Pongo 
pygmaeus 


fertilin alpha protein 


4128 


92 


964


gi794073


Macaca 
fascicularis 


fertilin alpha-I 


3136 


74 


964 


gi1841702


Macaca 
fascicularis 


fertilin alpha-I isoform 


3136 


74 


965


gi4iu/22y 


Homo sapiens 


lipophilin A


454 


100 


965


gi4107231


Homo sapiens 


lipophilin B 


267 


60 


965


gll /so /J59 


Oryctolagus 
cuniculus 


lipophilin AL2 


248 


54


966


gl333MUU 


Homo sapiens 


CD39L3 


2816 


100 


966 


gi13817037


Homo sapiens 


E-type ATPase 


2812 


99 


966


gi20988653 


Homo sapiens 


Similar to ectonucleoside triphosphate 
diphosphohydrolase 3 


2413 


99 


967


gloy42U9o 


Mus musculus 


CBLN3 


936 


93 


967 


gi180251


Homo sapiens 


precerebellin 


549 


57 


967


gi5702371


Mus musculus 


precerebellin-1 


542 


56 


968


gi17390957


Mus musculus 


Similar to RIKEN cDNA 2010001E11
gene 


129 


32 


968


gllo41Uo3o | 


Listeria 

monocytogenes 


similar to multidrug-efflux transporter 


95 


27 


968 


gi4914624


Listeria 

monocytogenes 


multidrug resistance transporter 


95 


27 


969


gi17390957


Mus musculus 


Similar to RIKEN cDNA 2010001E11
gene 


191 


26 


969


glzozooUo 


Bacillus subtilis 


glucose transporter 


100 


23 


969 


gi14023148


Mesorhizobium 
loti 


probable fosmidomycin resistance protein 


112 


25 


970


glli lol 123 


Homo sapiens 


transcript Y 10 


151 


54 


970


gl4D4jJ17 


Acipenser 
ruthenus 


immunoglobulin light chain precursor 


160 


25 


970


giyyj IDyy 


Salmo trutta


MHC class I heavy chain 


160 


31 


971 


gi4160197 


Homo sapiens 


dJ327J16.2 (supported by GENSCAN 
and GENEWISE) 


2515 


99 


971




Rattus 
norvegicus 


neuronal pentraxin receptor 


2238 


89 


971 


gi12744624


Mus musculus 


neuronal pentraxin receptor 


2212 


88 


972


gi4760782


Mus musculus 


Ten-m4 


4188 


96 


972 


gi3170615 


Mus musculus 


DOC4 


4166 


96 


972 


gi5307785


Danio rerio 


ten-m4 


3537 


78 


973 


gi14714932




Similar to nuclear factor (erythroid-
derived 2)-like 1 


3770 


100


973 


gi473090 


Mus musculus 


NFE2-related factor 1 


3644 


96 


973 


gi3978250 


Mus musculus 


Nrfl splice variant D 


3280 


96 


974 


gi7716100 


Rattus 
norvegicus 


selective LIM binding factor 


8413 


95 


974 


gi17044301


Leishmania 
major 


possible LIM-binding factor 


2139 


36 


974 


gi10440379


Homo sapiens 


FLJ00025 protein 


135 


25


975


gi20799661 


Mus musculus 


mucolipin-2 


1593 


72 


975


gi20987535


Mus musculus


RIKEN cDNA 3300002C04 gene


1 <oo 

ijyu 


/l 


975 


gi19072756


Mus musculus 


mucolipin-3 


1136 


51 


976


gi20799661


Mus musculus 


mucolipin-2 


2jy4 


83 


976 


gi20987535 


Mus musculus 


RIKEN cDNA 3300002C04 gene 


2391 


82 


976


gi19072754


Homo sapiens 


mucolipin-3 


1674 


59 


977


gl4U3U2U 


Mus musculus 


En-2/lacZ fusion protein 


988 


96 


977 


gi14193747


Mus musculus 


zinc finger 142 


258 


24 


977 


1 c i ai >n 

gu510147 


Homo sapiens 


similar to Human zinc finger 
protein(ZNF142) 


223 


20 


978 


gi10581238


Halobacterium 
sp. NRC-1


Vng1783h


54 


46 




giiyoyy2y4 


Arabidopsis 
thaliana


AT3g48750/T21J18_20


73 


30 


979 


gi7959724 


Homo sapiens 


PRO0929 


63 


30 


979


gi13540242


Anopheles 
stephensi 


NADH dehydrogenase subunit 5 


62 


31 


979


glZUyU4o4 / 


Methanosarcina 
mazei Goel 


8-oxoguanine DNA glycosylase


o4 


40 


980


„;cOQi fin 


Homo sapiens 


HTRA serine protease


2164 


100


980


gio iJUjy 


Homo sapiens 


serin protease with IGF-binding motif 


2164


100


980 


gi1621244


Homo sapiens 


novel serine protease, PRSS11 


2164 


100 


981


gl/U08U2:> 


Callithrix
jacchus 


prochymosin 


832


68


981


gll 985 1892 


Bos taurus 


chymosin precursor 


515 


77 


981 


gi162860


Bos taurus 


preprochymosin b 


752 


62 


982


gi18461371


Rattus 
norvegicus 


sulfatase FP 


276 


68 


982 


gi21961489 


Mus musculus 


Similar to sulfatase FP 


276 


68 


982


gi15430244


Coturnix
coturnix 


N-acetylglucosamine-6-sulfatase 


263 


68 


983


glJU4jO IL 


Lactococcus 
lactis 


transmembrane protein Tmp3


69


32 


983


gi17428881


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN


62 


34 


983


gl4JJ /U/ 


Zea mays 


prolin rich protein 


63


48


984


glOUlJ40J 


Bothrops 
jararaca 


carboxypeptidase homolog 


826


AC 

40 


984


giy3jo*t*fo 


Mus musculus 


carboxypeptidase R 


812 


45


984 


gi7416967 


Mus musculus 


thrombin-activatable fibrinolysis inhibitor 


812 


45 


985


gl0UlJ4Oo 


Bothrops 
jararaca 


carboxypeptidase homolog 


826 


46


985


giyDJo44o 


Mus musculus 


carboxypeptidase R 


812


45 


985


gi/4ioyo/ 


Mus musculus 


thrombin-activatable fibrinolysis inhibitor 


812


45


986


mil SA^mi 
gll 1 D4j /U/ 


Homo sapiens 


ISCU2


845


100






Mus musculus




R A 7 


QC 

yo 


986 


gi11545705


Homo sapiens 


ISCU1 


663 


99 


987 


gi12314022


Homo sapiens 


dJ553F4.4 (Novel protein similar to
Drosophila CG8055 protein) 


881 


89 


987 


gi22417143 


Homo sapiens 


CGI-301 protein 


853 


100 


987 


gi13182765


Homo sapiens 


CDA04 


560 


60 


988 


gi52959 


Mus musculus 


precursor polypeptide (AA -26 to 108) 


146 


34 


988 


gi198922


Mus musculus 


lymphocyte differentiation antigen 


145 


34 


988 


gi198926


Mus musculus 


Ly-6A.2 alloantigen 


145 


34


giijyyu4oU 


Homo sapiens 


Similar to AE-binding protein 2 


1570 


100 




gi4106464


Mus musculus 


AE-1 binding protein AEBP2


1555 


98 




gi21595036


Mus musculus 


AE binding protein 2 


1555 


98 


991 


gi23903 


Homo sapiens 


63kDa protein kinase 


2897 


99 


991


gi204058


Rattus 
norvegicus 


extracellular signal-related kinase 3 


1499 


62 


991 


gi16306437


Homo sapiens 


ERK-3 


1492 


62 


992 


gi17016967


Homo sapiens 


NUANCE 


3403 


90 


992 


gi17861384


Homo sapiens 


nesprin-2 gamma 


3403 


90 


992 


gi21748548


Homo sapiens 


FLJ00347 protein 


3403 


90 


993 


gi20070711 


Homo sapiens 


similar to RIKEN cDNA 2310044D20 


997 


100 


993 


gi18204756


Mus musculus 


Similar to RIKEN cDNA 2310044D20 
gene 


626 


68 


993 


gi7304139 


Drosophila 
melanogaster 


CG12159-PA 


111 


28 


994


gll4Z/5yz7 


Mus musculus 


gliacolin 


866 


68 


994


gl 105664/ 1 


Mus musculus 


Gliacolin 


866 


68 


994


gi j /4 /oyy 


Mus musculus 


C1q-related factor


734 


67


995 


gi20987689 


Homo sapiens 


Similar to allantoicase 


1838 


99 


995


gi14718648


Homo sapiens 


allantoicase 


1633 


99 


995 


gi9255889 


Mus musculus 


allantoicase 


1476 


77 


997 


gi2522208 


Homo sapiens 


Ras-GRF2 


6407 


99 


997 


gi5882290 


Homo sapiens 


Ras guanine nucleotide exchange factor 2 


6401 


99 


997 


gi57665 


Rattus rattus 


P140 RAS-GRF 


4121 


65 


998 


gi22038159 


Homo sapiens 


zizimin1


8544 


100 


998 


gi14597976


Homo sapiens 


human CLASP-4 


3533 


56 


998 


gi550420 


Rattus 
norvegicus 


trg 


2842 


87 


999 


gi17861850


Drosophila 
melanogaster 


GM03763p 


334 


70 


999 


gi17862036


Drosophila 
melanogaster 


LD05823p 


265


47 


999 


gi10178624


Mus musculus 


SETA binding protein 1; SB1 


215 


45 


1000 


gi21594273


Homo sapiens 


SAC2 suppressor of actin mutations 2-
like (yeast) 


3626 


100 


1000 


gi14041697


Homo sapiens 


dJ1033B10.5.1 (SAC2 (suppressor of 
actin mutations 2, yeast, homolog)-like 
(ARE1), isoform 1) 


3587 


99 


1000 


gi3850063 


Rattus 
norvegicus 


ARE1 


3576 


98 


1001 


gi1438534


Rattus 
norvegicus 


rA9 


4002 


61 


1001 


gi1438532


Rattus 
norvegicus 


rAl 


430 


36 


1001 


gi9438033 


Homo sapiens 


ser/arg-rich pre-mRNA splicing factor 
SR-A1 


407 


35 


1002 


gi1438534


Rattus 
norvegicus 


rA9 


4002 


61 


1002 


gi9438033 


Homo sapiens 


ser/arg-rich pre-mRNA splicing factor 
SR-A1 


407 


35 


1002 


gi10440402


Homo sapiens 


FLJ00034 protein 


407 


35 


1003 


gi1675220


Cricetulus 
griseus 


SREBP cleavage activating protein 


6200 


92 


1003 


gi20378357 


Drosophila
melanogaster


ER-golgi escort protein


810


39








1003 


gi10728147


Drosophila 
melanogaster


CG8356-PA 


810 


39 


1004 


gi12652851


Homo sapiens 


potassium channel modulatory factor 


1987 


100 


1004 


gl*fo JOJJ / 


Mus musculus


udd 1 -y 1 


1453 


96 


1004 


gi16768790


Drosophila 
melanogaster


LDUJj 15p 


876 


63 


1005 


gi7270532 


Arabidopsis 
thaliana


DNA-directed RNA polymerase (EC
2.7.7.6) II largest chain


173 


29 


1005


oi16S0^ 


Arabidopsis
thaliana


RNA polymerase II


173 


29 


1005


gl 1 U*T-7*T 


Arabidopsis
thaliana


DNA-directed RNA polymerase 


173 


29 


1006 


ml 1 875^1 8 


Mus musculus


synaptotagmin XIII


2004 


89


1006 


gi21410154 


Mus musculus 


synaptotagmin 13 


2004 


89 


1006 


cn*1 1 1 109^0 


Rattus
norvegicus


synaptotagmin 13 


2000 


89 


1007 


gi3800881 


Homo sapiens 


RanBP7/importin 7 


5447 


100 


1007 


oil 1 149*101 
gi l lO*rZjy 1 


Mus musculus 


RanBP7/importin 7


5418 


99 


1007 


gi11544639


Homo sapiens 


importin7 


5307 


100 


1008 


gijj /oyoo 


Homo sapiens 


dJ475B7.2 (novel protein)


3770 


99 


1008 


gi18676522


Homo sapiens 


FLJ00158 protein 


1512 


100 


1008


r»J9i <ro<:i« 


Mus musculus 


Similar to RIKEN cDNA 5830482G23 
gene 


1151 


71 


1009 


gi4406393 


Bos taurus 


differentiation enhancing factor 1 


4699 


95 


1009


tri406'361ii 


Mus musculus 


ADP-ribosylation factor-directed GTPase 
activating protein isoform a 


4694 


94 


1009


oi40616 1 6 


Mus musculus 


ADP-ribosylation factor-directed GTPase
activating protein isoform b 


3186 


79 


1010 


oil 641 1097 
gnu*t 1 kyz. 1 


Listeria
monocytogenes


imoz4jy 


57 


52 


1010 


gi16415055


Listeria innocua 


lin2533 


61 


51 


1010 


gi2983786 


Aquifex 
aeolicus 


glucose-1-phosphate
thymidylyltransferase 


70 


39 


1011


giyZOvPRQ 


Homo sapiens 


adlican 


1631 


47 


1011




Homo sapiens 


fibulin-6


502


28 


1011 


gi3328186 


Caenorhabditis 
elegans 


hemicentin precursor 


539 


27


1012


m4001 60 C 


Sus scrofa 


mat-8 


67 


30 


1012 


gi2622724 


Methanothermobacter
thermautotrophicus str. Delta H


conserved protein 


82 


29 


1012


gi4yoioo 


Mus musculus 


zona-pellucida-binding protein (sp38)


85 


27


1013 


gi17511816


Homo sapiens 


Similar to RIKEN cDNA 1110032O22
gene 


1468 


99 


1013 


gi7211438




goigin-o / 


100 


30 


1013 


gi6003208 


Human immunodeficiency
virus type 1


p17 protein


84 


29 


1014 


gi17511816


Homo sapiens 


Similar to RIKEN cDNA 1110032O22
gene 


878 


100 


1014 


gi6003208 


Human immunodeficiency
virus type 1


p17 protein


84 


29 
1014


gi21957065 


Yersinia pestis 
KIM 


uroporphyrinogen III methylase 


90 


34 


1015


m'99 A6A(\1 


Homo sapiens 


centrin 


842 


100 


1015 


gi13529248


Homo sapiens 


centrin, EF-hand protein, 3 (CDC31 yeast
homolog)


839 


99 


1015 


gi2246424 


Mus musculus 


centrin 


832 


98 


1016


gll /4zo7oj 


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN 


530 


43 


1016


'1 CI CC Ail i£ 

gll5l3394o 


Agrobacterium 
tumefaciens str. 
C58 (Cereon)


AGR_C_1725p


379 


41 


1016




Sinorhizobium 
meliloti 


CONSERVED HYPOTHETICAL
PROTEIN


372 


39 


1017


m 174987^ 
gll / *fZO / O J 


Ralstonia 
solanacearum


CONSERVED HYPOTHETICAL
PROTEIN


381 


43 


1017 


gllJU / J7 l J 


Sinorhizobium
meliloti 


CONSERVED HYPOTHETICAL
PROTEIN


367 


48 


1017 


oil 9^411 IS 


Corynebacterium
glutamicum




265 


30 


1018 


gi6693701


Homo sapiens


melanopsin


2234 


91


1018 


gi21928729 


Homo sapiens 


seven transmembrane helix receptor 


2190 


99 


1018


mfiAQ1701 


Mus musculus 


melanopsin 


1735 


73 


1019 


gi439296 


Homo sapiens 


garp 


822 


37 


1019


<rin"^ 79979 


Homo sapiens 


dJ756G23.1 (novel Leucine Rich Protein) 


243 


34 


1019 


gi19344010


Homo sapiens 


insulin-like growth factor binding protein, 
acid labile subunit 


293 


29 


1020 


gi15706421


Homo sapiens 


middle-chain acyl-CoA synthetase 1


1346 


99 


1020


oi 1 ^4527109 
gllJ^fO fj\)Z 


Homo sapiens 


medium-chain acyl-CoA synthetase 


1346 


99 


1020 


gi5019275 


Bos taurus 


xenobiotic/medium-chain fatty acid:CoA 
ligase form XL-III 


1088 


78 


1021 


gi6650766 


Homo sapiens 


PDZ domain-containing guanine 
nucleotide exchange factor I 


6216 


100 


1021 


gi20386206 


Homo sapiens 


PDZ domain-containing guanine 
nucleotide exchange factor PDZ-GEF2 


5822 


98 


1021


gl loo /4/UU 



Homo sapiens 


Rap1 guanine nucleotide-exchange factor


5803 


98 


1022 


gi20386206 


Homo sapiens 


PDZ domain-containing guanine 
nucleotide exchange factor PDZ-GEF2 


5942 


100 


1022


gi18874700


Homo sapiens


Rap1 guanine nucleotide-exchange factor

DTY7 rjDDOD 


5923 


99 


1022 


ml 8874608 

too/ *ttjjr o 


Homo sapiens


Rap1 guanine nucleotide-exchange factor
PDZ-GEF2A


5923 


99


1023 


<rt 138 10306 


Homo sapiens


transmembrane protein 7 


268 


37 


1023 


oil 89^0794 


Mus musculus


transmembrane protein 7 


264 


37 


1023 


gi20270907 


Oncorhynchus 
mykiss


VHSV-induced protein-5


243 


33 


1024 


gi21779869 


Homo sapiens 


IL-17RE 


9RQn" 


100


1024 


gi21779866 


Mus musculus 


IL-17RE 


1394 


74 


1024 


gi21779857


Homo sapiens 


IL-17RC 


246 


29 


1025 


gi21779869 


Homo sapiens 


IL-17RE 


2928 


100 


1025 


gi21779866 


Mus musculus 


IL-17RE 


1388 


75 


1025 


gi21779857 


Homo sapiens 


IL-17RC 


246 


29 


1026 


gi14150450


Rattus 
norvegicus 


UDP-GalNAc:polypeptide N- 
acetylgalactosaminyltransferase T9 


1352 


93 


1026 


gi16769916


Drosophila
melanogaster


SD10722p


473


38










glzloZ/ Wj 


Drosophila 
melanogaster 


CG30463-PA


417 


38 




gUJZl /uo/ 


Homo sapiens 



stem cell factor isoform 1 


1013 


95 


1027 


gi337934 


Homo sapiens 


stem cell factor 


1013 


95 


1027


glloz/47/ 


Felis catus 


stem cell factor 


893 


84 


1028 


gi1377894


Homo sapiens 


OB-cadherin-1 


1478 


64 


1028


gi1377895


Homo sapiens 


OB-cadherin-2 


1478 


64 


1028 


gi506404 


Homo sapiens 


cadherin-11 


1474 


63 


1029 


gi1377894


Homo sapiens 


OB-cadherin-1 


1628 


56 


1029 


gi1377895


Homo sapiens 


OB-cadherin-2 


1628 


56 


1029 


gi506404 


Homo sapiens 


cadherin-11 


1623 


56 


1030 


gi1398903


Mus musculus 


Ca2+ dependent activator protein for 
secretion 


6314 


90 


1030 


gi577428 


Rattus 
norvegicus 


Ca2+-dependent activator protein;
calcium-dependent actin-binding protein 


5003 


96 


1030 


gi6980012 


Drosophila 
melanogaster 


secretion calcium-dependent activator 
protein 


3540 


60 


1031 


gi217705


Sus scrofa 


dipeptidase precursor 


781 


51 


1031 


gi2102 


Sus scrofa 


dipeptidase 


781 


51 


1031 


gi8248922 


Homo sapiens 


renal dipeptidase; RDP 


762 


50 


1032 


gi18073362


Homo sapiens 


cystine/glutamate transporter 


2552 


100 


1032 


gi11493652


Homo sapiens 


calcium channel blocker resistance 
protein CCBR1 


2552 


100 


1032 


gi13924720


Homo sapiens 


cystine/glutamate transporter xCT 


2552 


100 


1033 


gi17028348


Homo sapiens 


Similar to methylenetetrahydrofolate 
dehydrogenase (NADP+ dependent), 
methenyltetrahydrofolate cyclohydrolase, 
formyltetrahydrofolate synthetase 


3748 


100 


1033 


gi20987924 


Mus musculus 


Similar to DKFZP586G1517 protein


3473 


92 


1033 


gi307178 


Homo sapiens 


MDMCSF (EC 1.5.1.5; EC 3.5.4.9; EC
6.3.4.3) 


2839 


62 


1034 


gi632676 


Saccharomyces 
cerevisiae 


Ylr410wp 


598 


44 


1034 


gi4070 


Saccharomyces 
cerevisiae 


nuf1


120 


20 


1034 


gi312175 


Saccharomyces 
cerevisiae 


SPC110/NUF1 


120 


20 


1035 


gi11066463


Rattus 
norvegicus 


RhoGEF glutamate transport modulator 
GTRAP48 


5589 


80 


1035 


gi19387126


Mus musculus 


guanine nucleotide exchange factor 


1794 


37 


1035 


gi7110160


Homo sapiens 


guanine nucleotide exchange factor 


1792 


37 


1036 


gi2921821 


Rattus 
norvegicus 


cytochrome P450 IIE1 


68 


28 


1036 


gi8515399


Human 
respiratory
syncytial virus 


attachment glycoprotein G 


64 


29 


1036 


gi5901834 


Drosophila 
melanogaster 


BcDNA.GH09358 


95 


23 


1037 


gi17128288


synthetic 
construct 


Primer 1 


1689 


100 


1037 


gi20269957 


Sus scrofa 


phospholipase C delta 4 


1469 


85 


1037 


gi21307610 


Mus musculus 


phospholipase C delta 4


1327 


77 


1038 


gi6978948 


Homo sapiens 


vaccinia related kinase 3 


76 


24 
1038


gi349667 


Carnobacterium 
piscicola 


carnobacteriocin A 


60 


41 


1038 


gi406315 


Carnobacterium 
piscicola 


piscicolin 61 


60 


41 


1039 


gi4159884 


Homo sapiens 


similar to mouse olfactory receptor 13; 
similar to P34984 (PID:g464305) 


1597 


99 


1039 


gi9368991 


Homo sapiens 


dJ1005H11.1 (7 TRANSMEMBRANE
RECEPTOR (RHODOPSIN FAMILY) 
(OLFACTORY RECEPTOR LIKE) 
PROTEIN)) 


1410 


100 


1039 


gi18480186


Mus musculus 


olfactory receptor MOR261-6 


1323 


81 


1040 


gi311626


Homo sapiens 


thrombospondin-4 


4787 


99 


1040 


gi3860231 


Mus musculus 


thrombospondin-4 


4557 


93 


1040 


gi929835 


Rattus 
norvegicus 


thrombospondin-4 


4547 


93 


1041 


gi14043083


Homo sapiens 


sperm associated antigen 9 


660 


100 


1041 


gi3116015


Homo sapiens 


sperm specific protein 


273 


98 


1041 


gil080H48 


Mus musculus 


JNK/SAPK-associated protein 1 


98 


41 


1042 


gi21654741 


Homo sapiens 


peptide/histidine transporter 


1746 


98 


1042 


gi2208839 


Rattus 
norvegicus 


peptide/histidine transporter 


1469 


79 


1042 


gi16740719


Mus musculus 


Similar to peptide transporter 3 


1453 


83 


1043 


gi21392228 


Drosophila 
melanogaster 


RH61354p 


1221 


41 


1043 


gi19353264


Homo sapiens 


Similar to dishevelled associated activator 
of morphogenesis 2 


2224 


65 


1043 


gi2947238 


Homo sapiens 


diaphanous 1 


717 


32 


1044 


gi15929979


Homo sapiens 


Similar to zinc finger protein 345 


2476 


100 


1044 


gi18643896


Homo sapiens 


zinc finger protein 


1656 


53


1044 


gi1020145


Homo sapiens 


DNA binding protein 


1656 


53 


1045 


gi12655913


Homo sapiens 


sprouty-4A 


386 


98 


1045 


gi4850326 


Mus musculus 


sprouty-4 


323 


81 


1045 


gi5917720


Mus musculus 


sprouty 4 


323 


81 


1046 


gi4539525 


Homo sapiens 


NAALADase II protein 


3881 


100 


1046 


gi3211746 


Sus scrofa 


folylpoly-gamma-glutamate 
carboxypeptidase 


2824 


70 


1046 


gi2897946 


Homo sapiens 


prostate-specific membrane antigen 


2787 


69 


1047 


gi5420389 


Leishmania 
major 


proteophosphoglycan 


139 


23 


1047 


gi915207 


Sus scrofa 


gastric mucin 


123 


22 


1047 


gi13592175


Leishmania 
major 


PPg3 


125 


23 


1048 


gi5918167 


Homo sapiens 


plexin-B1/SEP receptor


2104 


54 


1048 


gi6010211


Homo sapiens 


semaphorin receptor 


2103 


54 


1048 


gi1655432


Mus musculus 


plexin 2 


1517 


30 


1049 


gi15990515


Homo sapiens 


Similar to RIKEN cDNA 0610020I02
gene 


3035 


100 


1049 


gi18380977


Mus musculus 


RIKEN cDNA 0610020I02 gene


2792 


92 


1049 


gi2384732 


Rattus 
norvegicus 


NAC-1 protein 


1269 


57 


1050 


gi15088540


Homo sapiens 


sterolin-2 


3127 


99 


1050 


gi11692802


Homo sapiens 


ABCG8 


3123 


99 


1050 


gi15146444


Homo sapiens 


sterolin-2 


3120 


99 


1051 


gi12652851


Homo sapiens 


potassium channel modulatory factor 


1987 


100 



WO 2004/080148 PCT/US2003/030720

170
TABLE 2A


SEQ ID NO:


Hit ID


Species


Description


S score


Percentage
identity

1051




Mus musculus


TYERT 01 


1453 


96 


1051


cri 1 67fiR7Qfi 
gliO/Oo 


Drosophila
melanogaster 


LfJJUoDlJp 


876 


63 




glJJ / jU 


Homo sapiens 


immunoglobulin lambda light chain 


716 


71 






Homo sapiens 


lambda-chain precursor (AA -20 to 215) 


703 


70 




goo 


Homo sapiens 


immunoglobulin lambda light chain 


697 


68 


1053 


gi21388773 


Homo sapiens 


kringle-containing protein 


1552 


100 






Homo sapiens 


kringle-containing transmembrane protein 


1238 


99 


1053


gi21388773


Homo sapiens 


kringle-containing protein 


1241 


100 


1054 


gi14495324


Homo sapiens 


CMRF35A 


421 


48 


1054




Homo sapiens 


CMRF35 leukocyte immunoglobulin-like
receptor


421


48 


1054 


gi396170 


Homo sapiens 


CMRF-35 antigen 


421 


48 


1055


gi4468256 


Homo sapiens 


MHC class I antigen 


1974 


100 


1055 


gi32139 


Homo sapiens 


HLA-A11E protein precursor (AA -24 to
341) 


1912 


97 


1055


gi4o /yuy 


Homo sapiens 


HLA-A11 antigen A11.1


1912 


97 


1056 


gi21667214 


Homo sapiens 


bactericidal/permeability-increasing 
protein-like 3 


741 


100 


1056 


gi57732 


Rattus rattus 


potential ligand-binding protein 


215 


35 


1056


gi11877276


Homo sapiens 


dJ726C3.5 (ortholog of potential 
ligand-binding protein RY2G5 (Rat))


176 


32 


1057 


gi21667214 


Homo sapiens 


bactericidal/permeability-increasing 
protein-like 3 


2226 


99 


1057 


gi57732 


Rattus rattus 


potential ligand-binding protein 


579 


32 


1057 


gi11877276


Homo sapiens 


dJ726C3.5 (ortholog of potential
ligand-binding protein RY2G5 (Rat))


540 


31 


1058 


gi21667214 


Homo sapiens 


bactericidal/permeability-increasing 
protein-like 3 


1919 


99 


1058


gi57732


Rattus rattus


potential ligand-binding protein 


485 


33 




gi11877276


Homo sapiens 


dJ726C3.5 (ortholog of potential 
ligand-binding protein RY2G5 (Rat)) 


447 


31 




gi21667214


Homo sapiens 


bactericidal/permeability-increasing
protein-like 3 


1842 


100 




gi57732


Rattus rattus 


potential ligand-binding protein 


485 


33 


1059 


gi11877276


Homo sapiens


dJ726C3.5 (ortholog of potential 
ligand_binding protein RY2G5 (Rat)) 


447 


31 


1060 


gi23911 


Homo sapiens 


polypeptide 7B2 precursor 


1148 


100 


1060 


gi7718079


Homo sapiens 


neuroendocrine protein 7B2 


1148 


100 


1060 


gi13529158


Homo sapiens 


secretory granule, neuroendocrine protein 
1 (7B2 protein)


1131 


99 


1061


gi18698601


Homo sapiens 


Smith-Magenis syndrome chromosome 
region candidate 7 protein 


2325 


100 


1061


gll50/j/52 


Sinorhizobium 
meliloti 


HYPOTHETICAL TRANSMEMBRANE 
SIGNAL PEPTIDE PROTEIN 


90 


29 


1061 


gi13623063


Streptococcus
pyogenes M1
GAS


heat shock protein - cochaperonin


70


51 


1062 


gi4128041 


Homo sapiens 


claudin-9 protein 


1116 


100 


1062 


gi4325296 


Mus musculus


claudin-9 


1078 


95 


1062 


gi14286272


Homo sapiens 


claudin 6 


826 


71


1063 


gi14286258


Homo sapiens 


ribosomal protein L29 


432 


65 


1063 


gi1215742


Homo sapiens 


HIP


432 


65 


1063 


gi793843 


Homo sapiens 


ribosomal protein L29 


432 


65 
1064


giooOl 55 5 


Rattus 
norvegicus 


glutamate receptor interacting protein 2 


3560 


86 


1064


gijoiyu// 


Rattus 
norvegicus 



AMPA receptor binding protein 


2743 


88 




glloyUoDO 


Rattus 
norvegicus 



AMPA receptor interacting protein GRIP 


1925 


59 


1065


gloZooojZ 


Homo sapiens


disabled-1


ZOOj 


99 


1065


oi17719R9 
gll / / 1Z5Z 


Mus musculus


mDab555 protein 


2797


90 


1065


m99fl05'*17 

gizzuyj j i / 


Gallus gallus 


disabled-1


2oJU 


90 


1066


gi3002527


Homo sapiens


neuronal thread protein AD7c-NTP


1 tLA 

lc>4 


00 


1066 


gi4336401 


Homo sapiens 


beta glucuronidase isoform d 


127 


72 


1066


«AA11£.A A7 


Homo sapiens 


beta glucuronidase isoform c 


127 


72 


1067




Homo sapiens 


testis specific serine/threonine kinase 2 


1858


99 


1067


giz/jooys 


Mus musculus


protein kinase 


1686 


89 


1067




Homo sapiens 


testis-specific serine/threonine kinase 1 


1230 


77 


1068




Homo sapiens 


prostaglandin D2 synthase (21kD, brain) 


977 


96 


1068 


gi12963879


Homo sapiens 


prostaglandin D synthase 


977 


96 


1068


giloy //z 


Homo sapiens 


prostaglandin D2 synthase 


977 


96 


1069 


gi13279311


Homo sapiens 


Similar to RIKEN cDNA 1500017E18 
gene 


1416 


96 




gll4JJo/lo 


Homo sapiens 


similar to HAGH 


1157 


100 


1069 


gi20988885 


Mus musculus


RIKEN cDNA 1500017E18 gene 


1151 


79 


1070 


gi13397835


Homo sapiens 


annexin A13 isoform b 


1795 


99 


1070 


gi757784 


Canis familiaris 


annexin XIIIb


1621 


89 


1070 


gi21218387


Oryctolagus 
cuniculus 


annexin XIIIb


1589 


88 


1071 


gi21707908


Homo sapiens 


solute carrier family 6 (neurotransmitter 
transporter, GABA), member 1


3129 


98 


1071 


gi31658 


Homo sapiens 


GABA transporter 


3114 


98 


1071 


gi204222 


Rattus 
norvegicus 


GABA transporter protein


3097 


96 


1072 


gi7160975


Homo sapiens 


voltage-gated sodium channel beta-3 
subunit 


834 


100 


1072 


gi7161889


Rattus 
norvegicus 


voltage-gated sodium channel beta-3 
subunit 


823 


98 


1072 


gi14165176


Rattus 
norvegicus 


sodium channel beta 3 subunit 


823 


98 


1074 


gi18676470


Homo sapiens 


FLJ00132 protein 


2515 


99 


1074 


gi21430928


Drosophila 
melanogaster


SD27341p 


324 


38 


1074 


gi20197056


Arabidopsis 
thaliana 


expressed protein 


206 


29 


1075 


gi452751 


Gallus gallus 


Gal beta 1,4 GlcNAc alpha 2,6- 
sialyltransferase 


949 


54 


1075 


gi2295223 


unidentified 


GALACTOSYLTRANSFERASE- 
SIALYLTRANSFERASE HYBRID 

PROTEIN


856 


48 


1075 


gi29434 


Homo sapiens 


beta-galactoside alpha-2,6- 
sialyltransferase


856 


48 


1076 


gi13344997


Homo sapiens 


Cat Eye Syndrome critical region protein 
isoform 2 


2223 


100 


1076 


gi13344995


Homo sapiens 


Cat Eye Syndrome critical region protein 
isoform 1 


2002 


99 


1076 


gi15928451


Mus musculus


Similar to cat eye syndrome chromosome 
region, candidate 5 


1649 


76 
1077


gi13344997


Homo sapiens 


Cat Eye Syndrome critical region protein 
isoform 2 


1662 


96 




gi13344995



Homo sapiens 


Cat Eye Syndrome critical region protein 
isoform 1


1662 


96 


1077


gi15928451


Mus musculus


Similar to cat eye syndrome chromosome 
region, candidate 5


1294 


75 


1078 


gi177870


Homo sapiens


alpha-2-macroglobulin precursor


2714


39


1078 




Homo sapiens


alpha 2-macroglobulin 690-730


z/Uo 


1 A 


1078 


gi579594


Homo sapiens


alpha 2-macroglobulin 690-740


2700 


39 


1079


glU / 1 OU*t 


Gallus gallus


ovomacroglobulin, ovostatin


1300 


34 


1079


gi579594


Homo sapiens


alpha 2-macroglobulin 690-740


1297


35 


1079


gi177870


Homo sapiens


alpha-2-macroglobulin precursor 


1296 


35 


1080 


gi671865


Gallus gallus


ovomacroglobulin, ovostatin 


806 


32 


1080 


gi177870


Homo sapiens


alpha-2-macroglobulin precursor


769 


31 


1080 


gi579592 


Homo sapiens 


alpha 2-macroglobulin 690-730 


769 


31 


1081


gi177870


Homo sapiens 


alpha-2-macroglobulin precursor 


2732 


40 


1081


gi579592


Homo sapiens 


alpha 2-macroglobulin 690-730


2726 


40 


1081 


gi579594 


Homo sapiens 


alpha 2-macroglobulin 690-740 


2718 


39 


1082


gi579594


Homo sapiens 


alpha 2-macroglobulin 690-740 


1297 


35 


1082


gi177870


Homo sapiens 


alpha-2-macroglobulin precursor 


1296 


35 


1082 


gi579592 


Homo sapiens 


alpha 2-macroglobulin 690-730 


1296 


35 


1083


gl4U4jo9 


Mus sp. 


carboxylesterase; Es-male 


2006 


66 


1083


glZl J101 


Anas 

platyrhynchos 


thioesterase B 


1261 


46 


1083 


gi2058318 


Homo sapiens 


carboxylesterase 


1253 


47 


1084


glZU /zoo 


Rattus 
norvegicus 


TGF-beta masking protein large subunit 


8731 


89 


1084


2134:7 j I/O 


Mus musculus


latent TGF beta binding protein


8640 


88 




guyyuyizo 


Homo sapiens 


transforming growth factor-beta binding 
protein-1S


7763 


99 


1085


oil 708^71 
gi l /yoJJ 1 1 


Homo sapiens


BRI3 binding protein


ool 


100 


1085


cn71061790 


Homo sapiens


BRI3 binding protein 


ool 


100


1085


gi18466808


Homo sapiens


cervical cancer 1 proto-oncogene-binding 
protein KG19


oj3 


99


1086 


oi9998^^ 


Gallus gallus


M-protein


*>Or*i 


42


1086


iri407007 
gi*t\/ / \/ y i 


Homo sapiens


iojkjl/ protein 


ion 


42


1086 


gi2950347 


Mus muscuius 


M-protein 


2931 


42 


1087


gllZOJ J lOJ 


Homo sapiens


zinc finger protein 256 


696 


65 


1087


gl40y4J04 


Homo sapiens


zinc finger protein 3 


696 


65 


1087


m"91 **979Q£ 

gizi jz/zyo 


Homo sapiens 


zinc finger protein 382 


495 


46 


1088 


gi2689441 


Homo sapiens 


F18547_1


188 


37 


1088


gli0ljo4o 


Homo sapiens 


zinc finger protein zfp6 


316 


49 


1088


gizi jz/zyo 


Homo sapiens 


zinc finger protein 382 


203 


38 


1089


gi12655460


Homo sapiens 


keratin associated protein 4.12


929 


75 


1089


gi13278825


Homo sapiens 


Similar to RIKEN cDNA 1110054P19
gene 


929 


75 


1089 


gi12655464


Homo sapiens 


keratin associated protein 4.15


900 


83 


1090 


gi12655460


Homo sapiens 


keratin associated protein 4.12


403 


85 


1090 


gi13278825


Homo sapiens 


Similar to RIKEN cDNA 1110054P19
gene 


403 


85 


1090 


gi12655442


Homo sapiens 


keratin associated protein 4.2 


397 


84 


1091 


gi12655464


Homo sapiens 


keratin associated protein 4.15


1260 


100 


1091 


gi12655452


Homo sapiens 


keratin associated protein 4.7 


1222 


90 


1091 


gi12655460


Homo sapiens 


keratin associated protein 4.12 


1156 


88 
1092


gi15722084


Homo sapiens 


bA304I5.1 (novel lipase)


1991 


100 


1092 


gi21594466 


Mus musculus 


RIKEN cDNA 4632427C23 gene 


1928 


87 


1092 


gi460143 


Homo sapiens 


lysosomal acid lipase/cholesteryl ester 
hydrolase 


1290 


60 


1093 


gi21594466 


Mus musculus 


RIKEN cDNA 4632427C23 gene 


1957 


88 


1093 


gi15722084


Homo sapiens 


bA304I5.1 (novel lipase) 


1935 


100 


1093 


gi460143 


Homo sapiens 


lysosomal acid lipase/cholesteryl ester 
hydrolase 


1290 


60 


1094 


gi8118040 


Homo sapiens 


orphan G-protein coupled receptor 


1804 


99 


1094 


gi8118052 


Mus musculus 


orphan G-protein coupled receptor 


1306 


82 


1094 


gi13177796


Homo sapiens 


retinoic acid induced 3 


728 


45 


1095 


gi18129609


Homo sapiens 


diacylglycerol acyltransferase 2 


600 


49 


1095 


gi15099951


Mus musculus 


diacylglycerol acyltransferase 2 


599 


49 


1095 


gi17426446


Homo sapiens 


bA351K23.5 (novel protein) 


572 


54 


1096 


gi17225337


Homo sapiens 


dendritic lectin 


1134 


95 


1096 


gi17224598


Homo sapiens 


blood dendritic cell antigen 2 protein 


1134 


95 


1096 


gi17225339


Homo sapiens 


dendritic lectin b isoform 


918 


94 


1097 


gi17225337


Homo sapiens 


dendritic lectin 


1182 


99 


1097 


gi17224598


Homo sapiens 


blood dendritic cell antigen 2 protein 


1182 


99 


1097 


gi17225339


Homo sapiens 


dendritic lectin b isoform 


966 


99 


1098 


gi21929119 


Homo sapiens 


seven transmembrane helix receptor 


1595 


100 


1098 


gi18479834


Mus musculus 


olfactory receptor MOR144-1 


1223 


77 


1098 


gi18480806


Mus musculus 


olfactory receptor MOR143-1


1163 


70 


1099 


gi5911169 


Homo sapiens 


transmembrane mucin 12 


3049 


99 


1099 


gi19526645


Homo sapiens 


intestinal membrane mucin MUC17 


815 


32 


1099 


gi5911171 


Homo sapiens 


mucin 11


684 


47 


1100 


gi37198 


Homo sapiens 


TM1-CEA preprotein 


455 


34 


1100 


gi179440


Homo sapiens 


biliary glycoprotein I precursor 


455 


34 


1100 


gi550031 


Homo sapiens 


BGPc 


455 


34 


1101 


gi6273399 


Homo sapiens 


melanoma-associated antigen MG50 


4733 


60 


1101 


gi1504040


Homo sapiens 


similar to D.melanogaster 
peroxidasin(U11052) 


4733 


60 


1101 


gi531385


Drosophila 
melanogaster 


peroxidasin precursor 


2013 


39 


1102 


gi6273399 


Homo sapiens 


melanoma-associated antigen MG50 


4458 


60 


1102 


gi1504040


Homo sapiens 


similar to D.melanogaster 
peroxidasin(U11052)


4458 


60 


1102 


gi531385 


Drosophila 
melanogaster 


peroxidasin precursor 


2013 


39 


1103 


gi7264653 


Mus musculus 


Kiaa0575 


2398 


61 


1103 


gi11611734


Homo sapiens 


GREB1a


513 


46 


1103 


gi915208 


Sus scrofa 


gastric mucin 


128 


30 


1104 


gi20219008


Chlamydomonas
reinhardtii


coiled-coil flagellar protein 


682 


36 


1104 


gi16519041


Drosophila 
melanogaster 


occludin-like protein 


203 


23 


1104 


gi3549261 


Dictyostelium 
discoideum 


interaptin 


175 


22 


1105 


gi12654511


Homo sapiens 


ATP-dependant interferon response 
protein 1 


693 


96 


1105 


gi17390689


Homo sapiens 


ATP-dependant interferon responsive 


693 


96 


1105 


gi10862826


Homo sapiens 


ADIR1 


689 


95 


1106 


gi15215375


Homo sapiens 


RNA binding motif protein 12


325 


72 


1106 


gi21666372 


Homo sapiens 


swan 


325 


72 
1106


gi19070194


Homo sapiens 


SWAN 


325 


72 


1107 


gi18157547


Mus musculus 


pecanex-like 3


3262 


97 


1107 


gi6650377 


Mus musculus 


pecanex 1 


2530 


74 


1107 


gi15076843


Homo sapiens 


pecanex-like protein 1 


2526 


74 


1108 


gi18157547


Mus musculus 


pecanex-like 3 


3138 


97 


1108 


gi6650377 


Mus musculus 


pecanex 1 


2409 


73 


1108 


gi15076843


Homo sapiens 


pecanex-like protein 1 


2405 


73 


1109 


gi7770237 


Homo sapiens 


PRO2822


233 


59 


1109 


gi21595759 


Homo sapiens 


similar to HC6 


211 


71 


1109 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c-NTP 


209 


67 


1110 


gi18159337


Pyrobaculum 
aerophilum 


paREP8 


77 


30 


1110 


gi1658310


Homo sapiens 


leukocyte surface protein 


97 


26 


1110 


gi7638235 


Mus musculus 


immunoglobulin heavy chain variable 
domain 


77 


25 


1111 


gi4263743 


Homo sapiens 


similar to UNC-93; similar to U89424 
(PID:g3642687) 


1575 


100 


1111 


gi12043567


Homo sapiens 


unc-93 related protein 


1571 


99 


1111 


gi17390915


Mus musculus 


Similar to unc93 (C.elegans) homolog B 


1372 


87 


1113 


gi4153873 


Homo sapiens 


similar to wee1-like protein kinase;
similar to P30291 (PID:g1351419)


2810 


100 


1113 


gi644770 


Xenopus laevis 


Wee1A kinase


1166 


64 


1113 


gi2827996 


Xenopus laevis 


wee1 homolog


1166 


64 


1114 


gi6606119 


Dothidea 
insculpta 


DNA-dependent RNA polymerase II 
RPB140 


81 


32 


1114 


gi2796053 


Mus musculus 


T cell receptor beta chain 


54 


48 


1115 


gi20372871 


Clarkia similis


cytosolic phosphoglucose isomerase 


56 


28 


1116 


gi21708029 


Homo sapiens 


similar to Alu subfamily SQ sequence 
contamination warning entry 


135 


70 


1116 


gi11493409


Homo sapiens 


PRO0898 


129 


59 


1116 


gi6650818 


Homo sapiens 


PRO1992


110 


70 


1117 


gi13810898


Rattus 
norvegicus 


inhibin binding protein long isoform 


310 


37 


1117 


gi2645890 


Homo sapiens 


IGSF1 


326 


40 


1117 


gi2370143 


Homo sapiens 


immunoglobulin-like domain-containing 
1 


326 


40 


1118 


gi13810898


Rattus 
norvegicus 


inhibin binding protein long isoform 


310 


37 


1118 


gi2645890 


Homo sapiens 


IGSF1 


312 


38 


1118 


gi2370143 


Homo sapiens 


immunoglobulin-like domain-containing 
1 


312 


38 


1119 


gi21707128 


Homo sapiens 


Ran binding protein 11


5047 


99 


1119 


gi20987296 


Mus musculus 


Similar to Ran binding protein 11


4898 


96 


1119 


gi17862636


Drosophila 
melanogaster 


LD41918p 


1191 


38 


1120 


gi18652832


Homo sapiens 


ASPP1 protein 


5703 


99 


1120 


gi16197705


Homo sapiens 


ASPP2 protein 


1556 


42 


1120 


gi1399805


Homo sapiens 


Bbp/53BP2 


1556 


42 


1121 


gi18448478


Aotus 
trivirgatus 


chorionic gonadotropin beta subunit 


47 


59 


1121 


gi5670272 


Human 
herpesvirus 8 


K1 glycoprotein


67 


38 


1121 


gi9886851 


Human 
herpesvirus 8 


K1 protein


63 


36
1122


gi2598461 


Homo sapiens 


dJ408N23.1 (suppression of
tumorigenicity 13 (colon carcinoma)
(Hsp70-interacting protein) (Progesterone
receptor associated P48 protein))


1887




1122 


gi904032 


Homo sapiens 


p48 


1860


96


1122 


gi21218374 


Homo sapiens 


FAM10A5 


1814 


yj 


1123 


gi8927428 


Homo sapiens 


otoraplin 


676 




1123 


gi12619173


Homo sapiens 


melanoma inhibitory activity like protein 


676 


100 


1123 


gi11323317


Homo sapiens 


dJ705D16.2 (Otoraplin)


676 


100


1124 


gi12034719


Mus musculus 


ankyrin-like protein 


467 


HO 


1124 


gi13469729


Homo sapiens 


breast cancer antigen NY-BR-1 


448 


^A 
jU 


1124 


gi21618588 


Homo sapiens 


testis-specific ankyrin motif containing
protein 


**81 


.4.7 


1125 


gi13469729


Homo sapiens 


breast cancer antigen NY-BR-1 


364 


51 


1125 


gi12034719


Mus musculus 


ankyrin-like protein


379


40 


1125 


gi21618588 


Homo sapiens 


testis-specific ankyrin motif containing
protein


345 


49 


1126 


gi7770139 


Homo sapiens 


PRO1722


263 


60 


1126 


gi11493483


Homo sapiens 


PRO2550 




67 


1126 


gi8572229 


Homo sapiens 


ubiquitous TPR-motif protein Y isoform 


249 


61 


1127 


gi6907090 


Oryza sativa
(japonica
cultivar-group)


Similar to Oryza sativa root-specific
RCc3 mRNA (L27208)


OO 


35 


1127 


gi5902450 


Cercopithecine 
herpesvirus 1 


glycoprotein G


58




1127 


gi12750734


Homo sapiens 


L-type voltage-dependent calcium
channel 


56


48


1128 


gi16878260


Homo sapiens 


Similar to angiotensin II, type I receptor-
associated protein 


796 


100


1128 


gi16588454


Homo sapiens 


AGTRAP protein 


705 


95 


1128 


gi9621816 


Homo sapiens 


ATRAP 


705 




1129 


gi17986216


Homo sapiens 


cell recognition molecule CASPR3 


1864 


98 


1129 


gi12330704


Mus musculus 


cell recognition molecule CASPR4


1776 
1 J /u 


71 


1129 


gi21961652 


Mus musculus 


cell recognition protein CASPR4 


1376 


71 


1130 


gi17986216


Homo sapiens 


cell recognition molecule CASPR3


6819 
Oo 1 L 


99


1130 


gi18390059


Homo sapiens 


cell recognition protein CASPR4




7A 


1130 


gi21961652 


Mus musculus


cell recognition protein CASPR4 


4724 


68 


1131 


gi21552969 


Mus musculus 


Williams-Beuren syndrome critical region 
gene 17 


3100 


97 


1131 


gi10336504


Homo sapiens


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase




ol 


1131 


gi11041469


Macaca 
fascicularis 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


1913 


58 


1132 


gi13625176


Homo sapiens 


thrombospondin


JOO 


4o 


1132 


gi14627121


Homo sapiens 


dJ824F16.3 (novel protein similar to 
mouse thrombospondin type 1 domain 
protein R-spondin) 


544 


46 


1132 


gi4519541


Mus musculus 


thrombospondin type 1 domain 


511 


43 


1133 


gi5305333 


Mus musculus 


protein kinase Myak-S 


865 


50 


1133 


gil8314319 


Mesocricetus 
auratus 


Mx-interacting protein kinase PKM 


865 


50 


1133 


gi5815143 


Mus musculus 


nuclear body associated kinase 2a 


865 


50 


1134 


gil4022292 


Mesorhizobium 
loti 


cell division protein 


45 


36 
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1134 


gil80143 


Homo sapiens 


CD53 glycoprotein 


45 


53 


1134 


gil80141 


Homo sapiens 


cell surface antigen 


45 


53 


1135 


gi14571502


Homo sapiens 


calcium-promoted Ras inactivator 


4174 


99 


1135 


gi2822157 


Homo sapiens 


similar to GTPase-activating proteins; 
35% similar to JC5047 (PID:g2136083)


3961 


99 


1135 


gi4185294


Homo sapiens 


rasGAP-activating-like protein


1898 


49 


1136 


gi11527987


Gallus gallus 


immunoglobulin-like receptor CHIR-A 


97 


30 


1136 


gi432214


Human
immunodeficiency
virus type 1


envelope glycoprotein gp120


43 


39 


1136 


gi15026993


Homo sapiens 


MUC5AC protein


64 


38 


1137 


gi15128103


Mus musculus 


nephronectin 


2971 


87 


1137 


gil5430248 


Mus musculus 


nephronectin long isoform 


2640 


80 


1137 


gi16040981


Mus musculus 


POEM 


2374 


87 


1139 


gi7638247 


Homo sapiens 


mesenchymal stem cell protein DSCD75 


595 


100 


1139 


gil7946258 


Drosophila 
melanogaster 


RE58349p 


165 


34 


1139 


gi21464462


Drosophila 
melanogaster 


RH58440p 


158 


36 


1140 


gi21619491 


Homo sapiens 


similar to expressed sequence AW049604 


235 


83


1140 


gi6572294


Homo sapiens 


bA262A13.1 (novel protein) 


126 


48 


1140 


gi215692 


Bacteriophage 
P4 


gop protein 


87 


28 


1141 


gi21619491


Homo sapiens 


similar to expressed sequence AW049604


454 


82


1141 


gi6572294 


Homo sapiens 


bA262A13.1 (novel protein) 


239 


48 


1141 


gi215692


Bacteriophage
P4


gop protein 


84 


33 


1142 


gl2U30o274 


Homo sapiens 


testicular haploid expressed gene 


1487 


80 


1142 


gi10443967


Homo sapiens 


THEG protein 


1487 


80 


1142


gi7416134


Homo sapiens 


testis-specific gene 


1487 


80 


1143


gi21928259


Homo sapiens 


seven transmembrane helix receptor 


1023 


100 


1143


gi18480746


Mus musculus 


olfactory receptor MOR261-10


864 


84 


1143


gi18480744


Mus musculus 


olfactory receptor MOR261-9 


858 


82 


1144 


gi21928655


Homo sapiens 


seven transmembrane helix receptor 


1458 


93 


1144 


gi18480746


Mus musculus 


olfactory receptor MOR261-10


1280 


79 


1144 


gi18480744


Mus musculus 


olfactory receptor MOR261-9 


1258 


78 


1145 


gi1707674


Streptomyces 
cinnamoneus 


elongation factor G 


52 


34 


1146


gi15779092


Homo sapiens 


Similar to syntaxin 18 


1295 


100 


1146


gi7707424 


Homo sapiens 


syntaxin 18 


1295 


100 


1146 


gi18203931


Mus musculus 


Similar to syntaxin 18 


873 


90 


1147 


gi14573319


Homo sapiens 


interleukin-1 HY2 


812


99 


1147 


gi18025344


Homo sapiens 


interleukin-1 receptor antagonist-like 
FIL1 theta 


809 


99 


1147 


gil9068192 


Mus musculus 


IL-1F10 


662 


82 


1148


'A i /vol ro 


Mus musculus 


hair keratin acidic 5; Ha5 keratin 


1116 


72 


1148 


gi3724107 


Homo sapiens 


keratin, type I 


1114 


72 


1148 


gil668744 


Homo sapiens 


HHa5 hair keratin type I intermediate 
filament 


1114 


72 


1149 


gil9353375 


Mus musculus 


RIKEN cDNA 1110031I02 gene


1417 


84 


1149 


gi6166378 


Mus musculus 


growth suppressor 1L 


141 


30 


1149 


gi15929776


Homo sapiens 


growth suppressor 1 


137 


41 


1150 


gil3623421 


Homo sapiens 


Similar to RIKEN cDNA 5730589L02 
gene 


1336 


90 



WO 2004/080148 



PCT/US2003/030720 



177 
TABLE 2 A 



SEQ 
ID 


Hit ID 


Species 




Description


S
score


Percentage 
identity 


1150 


gi19484086


Mus musculus 


RIKEN cDNA 5730589L02 gene


10S7 

1Z-0 / 


50 


1150 


gil699265 


Homo sapiens


malignant cell expression-enhanced
gene/tumor progression-enhanced gene




^O 


1151 


gil5419605 


Canis familiaris 


masticatory epithelia keratin 2p 


1204 


55 


1151 


gil4595019 


Homo sapiens


If pra tin ^ ire 


1 i / J 




1151 


gi6092075 


Mus musculus 


tvr\P TI PvfriVprafirt 


ill/; 


r i 
51 


1152 


gi11066090


Homo sapiens


matrix metalloproteinase MMP-27


1 1SO 


O/C 


1152 


gi12006364


Tupaia
belangeri 


matrix metalloproteinase-27


1101 
1 1/1 


OA 


1152 


gi3511149 


Gallus gallus 


matrix metalloproteinase 


663 


57 


1153 


gi11066090


Homo sapiens


matrix metalloproteinase MMP-27


1 ^QO 
1 JO/ 


O/C 


1153 


gi12006364


Tupaia 
belangeri 


matrix metalloproteinase-27 


1121 


80 


1153 


gi3511149 


Gallus gallus


matrix metalloproteinase


ooJ 


57 


1154 


gi6689894 


Homo sapiens 


Suppressor of Fused 


2599 


100 


1154 


gi5739507 


Homo sapiens 


suppressor of fused 


2594 


99 


1154 




Mus musculus


Su(fu) protein 


2541 


97 


1155 


gi21667212 


Homo sapiens 


bactericidal/permeability-increasing 
protein-like 2 


2600 


100 


1155


gi20387085


Oncorhynchus
mykiss


LBP (LPS binding protein)/BPI
(bactericidal/permeability-increasing
protein)-1


690 


31 


1155 


gi20387087 


Oncorhynchus 
mykiss


LBP (LPS binding protein)/BPI
(bactericidal/permeability-increasing
protein) like-2


685 


30 


1156 


gi11229139


Homo sapiens


uuuauuj ^ori i^dca uclci mining 
region Whnv 1 8^ 


ZvJOO 


i on 
1UU 


1156 


gil2082687 


Homo sapiens 


Sry-related HMG-box protein 


2066 


100 


1156 


gi8894593 


Homo sapiens


uUA 1. 0 U1ULC111 


ZtlOO 


1 oo 
lUU 


1157 


gi19526647


Homo sapiens 


oxidored-nitro domain-containing protein 


837 


85 


1157 


gi7303522


Drosophila
melanogaster


CG13178-PA


172 


31 


1157 


gi16304788


Mus musculus


bendless-like ubiquitin conjugating
enzyme


oi 




1158 


gi19526647


Homo sapiens 


oxidored-nitro domain-containing protein


52^7 
OJ / 


OJ 


1158 


gi7303522 


Drosophila 
melanogaster 


CG13178-PA 


1 70 


^1 


1158 


gil6304788 


Mus musculus 


bendless-like ubiquitin conjugating 
enzyme


83 


28 


1159 


gil794221 


Mus musculus 


DNA ligase III-beta


00X7 


so i 


1159 


gil794223 


Mus musculus 


DNA ligase III-alpha


2987 


89 


1159 


gil9550955 


Homo sapiens


ligase III, DNA, ATP-dependent




100


1160 


gil5667919 


Homo sapiens


SERPINB10


1 £Ofi 


oo 
yy 


1160 


gil2597188 


Homo sapiens


squamous cell carcinoma antigen 2


740 


AQ 

4o 


1160 


gil235617 


Homo sapiens 


squamous cell carcinoma antigen 


749 


48 


1161 


gil5141587 


Eulemur 
rubriventer 


olfactory receptor 


67 


34 


1161 


gi21739229 


Oryza sativa 


OSJNBa0072F16.8 


67 


43 


1161 


gi21629328 


Leishmania 
major 


L3561.8 


65 


37 


1162 


gi2589190 


Homo sapiens 


skin-specific protein 


68 


39 


1162 


gi38232


Pan troglodytes 


immunoglobulin alpha heavy chain 


61 


39 


1162 


gil4021730 


Mesorhizobium 
loti 


c-type cytochrome biogenesis protein 


68 


31 
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1163 


gi7228149 


Mus musculus 


ATFa-associated factor 


354 


50


1163 


gi7303705 


Drosophila 
melanogaster 


CG12340-PA 


193 


24 


1163 


gi5052666 


Drosophila 
melanogaster 


BcDNA.LD26050 


193 


24 


1164


gi20901968


Caenorhabditis 
elegans 


C. elegans RPL-36 protein
(corresponding sequence F37C12.4)


71 


34 




■en i 1 a c i 

gi5911451 


Drosophila 
nannoptera 


cytochrome oxidase III 


43 


41


1165 


gil3276253 


Homo sapiens 


T-cell receptor beta chain VJ region 


56 


34 


1165


gl39zooyo 


Homo sapiens 


ariz domain protein 1A isoform C


55


38 


1166 


gi20381326 


Homo sapiens 


Similar to caspase 8, apoptosis-related 
cysteine protease 


263 


100 


1166 


gu4211398 


Homo sapiens 


caspase-8L 


263 


100 


1166 


gil9401524 


Homo sapiens 


procaspase-8 


223 


95 


1167 


gil0440448 


Homo sapiens 


FLJ00060 protein


1204 


98 


1167 


gi3983420 


Homo sapiens 


KIR3DL1-like natural killer cell receptor


693 


47 


1167 


gi13560453


Homo sapiens 


killer cell immunoglobulin-like receptor
3DL1


693 


47 


1168


gi1799570


Rattus 
norvegicus 


TIP120


4573 


99 


1168


gi/688703 


Homo sapiens 


TIP120 protein


4573 


99 


1168 


gi58U583 


Rattus 
norvegicus 


TIP120-family protein TIP120B 


2735 


57 


1169 


gil3016701 


Homo sapiens 


activating coreceptor NKp80 


1226 


100 


1169 


gi7188567


Homo sapiens 


lectin-like receptor F1


1226 


100 


1169 


gi22449867 


Macaca
fascicularis


NKp80 NK receptor 


1122 


90 


1170 


gil4027275 


Mesorhizobium 
loti


nodulation protein nodG, 3-oxoacyl-(acyl 
carrier protein) reductase 


70 


27 


1170 


gil531618 


Rhizobium sp. 
N33 


NodG 


68 


26 


1170 


gi6899062 


Ureaplasma 
urealyticum 


seryl-tRNA synthetase 


70 


31 


1171 


gi3021409 


Homo sapiens 


transducin (beta) like 1 protein 


3057 


100 


1171 


gil3161069 


Homo sapiens 


transducin beta-like 1 


2548 


91 


1171 


gi12642596


Homo sapiens 


nuclear receptor co-repressor/HDAC3 
complex subunit TBLR1 


2431 


86 


1172 


gi13623421


Homo sapiens 


Similar to RIKEN cDNA 5730589L02 
gene, clone MGC:13124
IMAGE:4110925, mRNA, complete cds.


380 


69 


1172 


gil2803383 


Homo sapiens 


clone MGC:2099 IMAGE:3051525,
mRNA, complete cds. 


376 


68 


1172 


gil3111983 


Homo sapiens 


clone MGC:4221 IMAGE:2958347, 
mRNA, complete cds. 


376 


68 


1173 


gil3623421 


Homo sapiens 


Similar to RIKEN cDNA 5730589L02 
gene, clone MGC:13124
IMAGE:4110925, mRNA, complete cds.


380 


69 


1173 


gil2803383 


Homo sapiens 


clone MGC:2099 IMAGE:3051525, 
mRNA, complete cds. 


376 


68 


1173 


gil3111983 


Homo sapiens 


clone MGC:4221 IMAGE:2958347,
mRNA, complete cds. 


376 


68 


1174 


gil3623421 


Homo sapiens 


Similar to RIKEN cDNA 5730589L02 
gene 


1830 


99 


1174 


gil9484086 


Mus musculus 


RIKEN cDNA 5730589L02 gene 


1802 


95 
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TABLE 2 A 



SEQ 
ID 


Hit ID 


Species 


Description 


S
score


Percentage
identity


1174 


gil699265 


Homo sapiens 


malignant cell expression-enhanced 
gene/tumor progression-enhanced gene


930 


81 


1175 


gil3182755 


Homo sapiens 


HPHRP 


1210 


100 


1175 


gi15929309


Homo sapiens 


phosphotriesterase related


1210


100


1175 


gi881499 


Mus musculus 


parathion hydrolase (phosphotriesterase)-
related protein


1069 


86 


1176 


gi552075 


Chironomus 
tentans 


giant secretory protein 


71 


28 


1176 


gil5419013 


Toxoplasma 
gondii 


subtilisin-like protein 


71 


34 


1176 


gil56534 


Chironomus 
tentans 


giant secretory protein (gsp) 


66 


28 


1177 


gi5458910 


Pyrococcus 
abyssi 


FLAGELLA-RELATED PROTEIN C 


103 


24 


1177 


gi487272 


Enterococcus 
hirae 


Na+ -ATPase subunit F 


90 


31 


1177 


gi9229886 


Ciona 
intestinalis


ezrin/radixin/moesin (ERM)-like protein 


111 


27 


1178 


gi21554060 


Arabidopsis 
thaliana 


phytocyanin 


44 


43 


1178 


gi205640 


Rattus 
norvegicus 


acetylcholine receptor alpha subunit 


53 


44 


1178 


gi4028904 


Rattus 
norvegicus 


nicotinic acetylcholine receptor alpha 4
subunit 


J J 


A A 

44 


1179 


gi18375961


Neurospora 
crassa 


related to AROA protein


JJ 


A A 

44 


1179 


gi2935025 


Rhodococcus 
opacus 


protocatechuate dioxygenase alpha
subunit 


JO 


JO 


1179 


gil3421646 


Caulobacter
crescentus
CB15


spoU rRNA methylase family protein


jy 


4U 


1180 


gi14348558


Homo sapiens 


cDNA encoding protease domain of 
endotheliase 1 


82 


38 


1180 


gil245184 


Danio rerio 


ZgOl 


66 


33 


1180 


gi6137097 


Homo sapiens 


serine protease DESC1


OZ 


JO 


1181 


gil9528151 


Drosophila 
melanogaster 


AT26759p 


59 


35 


1181 


gil6768554 


Drosophila 
melanogaster 


GM08606p 


59 


35 


1181 


gi7291750 


Drosophila 
melanogaster 


CG4065-PA 


59 


35 


1182 


gil3377880 


Cricetulus 
longicaudatus


arginine N-methyltransferase p82 isoform 


3253 


85 


1182 


gi13377882


Cricetulus 
longicaudatus


arginine N-methyltransferase p77 isoform




or 
OJ 


1182 


gi21626587 


Drosophila
melanogaster 


CG9882-PA 


1 O 1 1 


36 


1183 


gil91185 


Cricetulus 
griseus 


phosphatidylserine decarboxylase 


1130 


88 


1183 


gi5921491 


Homo sapiens 


dJ858B16.2 (phosphatidylserine 
decarboxylase (PSSC, EC 4.1.1.65))


1220 


96 


1183 


gil6306618 


Homo sapiens 


phosphatidylserine decarboxylase 


1220 


96 


1184 


gi11907580


Mus musculus 


TSC22-related inducible leucine zipper 
3c 


894 


87 


1184 


gi5231131 


Homo sapiens 


TSC-22 related protein 


460 


98 
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ID 


Hit ID


Species 


Description 


S
score


Percentage 
identity 


1184


mCQI 01 £1 

gu>y iy ioi 


Homo sapiens


TSC-22-like Protein


40U 


oo 
9o 


1185


ml 187AA17 
gl 1 JO /44j / 


Homo sapiens


cerebral protein-11


1401 


OO 


1185 


gil5292367 


Drosophila 
melanogaster 


LD47668p 


510 


41 


1185 


gi2443444 


Homo sapiens 


TEX28 


310 


40 


1186


gll3543y4U 


Homo sapiens 


similar to RIKEN cDNA 2610017G09 
gene 


2568 


99 


1186


r« 1 Q7fl/l^7ft 

gl 1oZU4j ZU 


Mus musculus


RIKEN cDNA 2610017G09 gene


ziol 


91


1186


giioyzjjji 


Homo sapiens 


Kobr-35 


1434


99 


1187


glloo/OOOU 


Homo sapiens 


FLJ00229 protein


931 


91


1187


gI0oZ4/ 1 1 


Caenorhabditis 
elegans


similar to 7TM chemoreceptor (srd-
family)


OA 

oU 


OA 

20 


1187


friR87<A77 

glooZDOZZ 


Rattus
norvegicus 


T cell receptor 


/CO 

oo 


36 


1188


gi17865311


Homo sapiens


dipeptidyl peptidase-like protein 9


4040 


100


1188


gi11095188


Homo sapiens


dipeptidyl peptidase 8 


787^ 
Zo/O 


ou 


1188


oi7 1 7 AS 1 11 


Homo sapiens


Similar to dipeptidylpeptidase 8 


771 7 
ZZ1 / 


JO 


1189


gi17865311


Homo sapiens


dipeptidyl peptidase-like protein 9


AC\f*Q 

4uoy 


100


1189 


gi11095188


Homo sapiens 


dipeptidyl peptidase 8 


2454 


59 


1189


glZ IZOJ lJj 


Homo sapiens


Similar to dipeptidylpeptidase 8 


OA« 
Z4jj 


DO 


1190 


gi17865311


Homo sapiens 


dipeptidyl peptidase-like protein 9 


4542 


98 


1190


gi11095188


Homo sapiens 


dipeptidyl peptidase 8 


TO 1 A 

ZolU 


<A 
OU 


1190


r«71 7£<\1 11 
glZ 1ZOJ 1 J J 


Homo sapiens 


Similar to dipeptidylpeptidase 8 


2151


57 


1191 


gi337508 


Homo sapiens 


ribosomal protein 


554 


99 


1191


gl j / /Z4 


Rattus rattus


ribosomal protein S25


554 


99 


1191


ml 70A^7O 

gllZo<J5Z5i 


Mus musculus 


ribosomal protein S25 


554


99 


1192


glZUol /O 


synthetic 
construct 


Dz-T antigen 


61 


40 


1193


cTi7178<\81 
gl /OZOJOO 


Drosophila 
melanogaster


mechanosensory transduction channel
NOMPC


OC 1 

o51 


ZO 


1193


cn'718^1 11 
gl / JO J 1 1 J 


Bos taurus


ankyrin 1 


777 
ill 


1A 
3U 


1193 


gi11065673


Caenorhabditis 
elegans


Y71A12B.4 


778 


28 


1194 


gi7672669 


Homo sapiens 


serine protease Htra2 


1890 


100 


1194


ml 76^7 AOS 


Homo sapiens 


HtrA-like serine protease 


1890


100


1194


gOo/UoOD 


Homo sapiens 


serine protease 


1890


100


1195


gi34y44y 


Homo sapiens 


A3 adenosine receptor 


904 


100


1195


gii jjjyuo4 


Homo sapiens 


bA552M11.6 (adenosine A3 receptor)


904


100


1195 


gi20988265 


Homo sapiens 


adenosine A3 receptor 


904 


100 


1196


glZ104">Ziy 


Drosophila 
melanogaster 


tAjl5o71-rA 


299 


37 


1196


gi9864185 


Drosophila 
melanogaster 


Crossveinless 2 


299


37 


1196 


gi7768636 


Xenopus laevis 


Kielin 


276 


34 


1197 


gi18480772


Mus musculus 


olfactory receptor MOR101-2 


1415 


84 


1197


giio4/yj4o 


Mus musculus 


olfactory receptor MOR101-1


1334


OO 

oz 


1197 


gi3769616 


Rattus 
norvegicus 


olfactory receptor 


973 


86 


1198 


gi498768 


Serratia 
marcescens 


Deoxyadenosyl-methyltransferase 


339 


51 


1198 


gil0799034 


Vibrio cholerae 


DNA adenine methylase


332 


54 


1198 


gi10799036


Yersinia
pseudotuberculosis


DNA adenine methylase


331 


52 
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TABLE 2 A 



SEQ 
ID 
1199 
1199 


Hit ID

gil6974751 
gi666121 


Species

Gallus gallus 
Xenopus laevis 


Description 

CALII 
cpl-1 


S 

score 
338 


Percentage 
identity 

37 


1199 
1200 


gi1213589
gi22296200 


Xenopus laevis 
Thermosynecho 
coccus 

elongatus BP-1 


Prostaglandin D Synthase 
asparaginyl-tRNA synthetase 


293 
292 
1057 


33 
33 
46 


1200 


gil7132791 


Nostoc sp. PCC 
7120 


asparaginyl-tRNA synthetase 


1027 


46


1200 


gil9713460 


Fusobacterium 
nucleatum 
subsp. 
nucleatum 
ATCC 25586 


Asparaginyl-tRNA synthetase 


1013 


43 


1201 


ml Qf\QQt\~7fl 

giioUooy/l) 
gi20067381 


Homo sapiens 
Homo sapiens 


Similar to RIKEN cDNA 4933400E14 
gene 

ALMS1 protein 


1263 
249 


99 

41


1201

1202 


gi21552774
gi347134 


Mus musculus 
Homo sapiens 


Almstrom syndrome 1 protein 
succinate dehydrogenase flavoprotein 
subunit 


219 
495 


38 
92 


1202 
1202 


gil2655061 
gi506338 


Homo sapiens 
Homo sapiens 


succinate dehydrogenase complex, 
subunit A, flavoprotein (Fp) 


495 


92 


1203 
1203 


gi18490322
gi21928186 


Homo sapiens 
Mus musculus 


flavoprotein subunit of complex II 
Similar to RIKEN cDNA 6330404M18
gene 


495 
2241 


92
99


1203 
1204 


gi17946082
gi9957161 


Drosophila 
melanogaster 
Homo sapiens 
Mus musculus 


GPI-gamma 4; GPIgamma4
RE54096p 

alphaCP-3 


1471 
688 

1722 


61 
47 

100 


1204 
1205 


gil5082311 
gil4574118 


Homo sapiens 

Caenorhabditis 

elegans 


alphaCP-3 

Similar to poly(rC)-binding protein 3
C. elegans DPY-19 protein 
(corresponding sequence F2^B7 10) 


1708 

840 

239 


99 
99 
31 


1205 

1205 
1206 


gil2328595 

gi18378695
gi189760


Heterodoxus 
macropus 
Bufo maculatus 
Homo sapiens 


NADH dehydrogenase subunit 2 
NADH dehydrogenase subunit 2 


79 


29 


1206 
1206 

1207 


gi189762
gi190792

gi688292 


Homo sapiens 
Homo sapiens 

Homo sapiens 


pyruvate dehydrogenase beta-subunit 
pyruvate dehydrogenase E1-beta subunit

pyruvate dehydrogenase E1-beta subunit
precursor

calmitine; calsequestrine 


1710 
1710 

I III) 

2029 


96

96 

96
100


1207 
1207 


gi2618621
gi164842


Mus musculus 

Oryctolagus 

cuniculus 


skeletal muscle calsequestrin 


1938 
1908 


94
94


1208 


gi22295775 


Thermosynecho 
coccus 

elongatus BP-1 


periplasmic sugar-binding protein of 
sugar ABC transporter 


65 


35


1208 


gi2622963 


Methanothermo 
bacter 

thermautotrophi 
cus str. Delta H 


conserved protein 


59 


30 


1208 

1209 
1209 
1209 


gil8377999 

gi11034760
gi10432376
gi11022733


Drysdalia 
coronata 
Homo sapiens 
Homo sapiens 
Mus musculus


NADH dehydrogenase subunit 1 
NIBAN

bG56G5.1 (novel protein) 
Niban 


61 

3692 
3334 
2320 


34 

99
99
67
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SEQ 
ID 

1210 
1210 
1210 


Hit ID

gi2982508 
gi3002925 
gi36733 


Species 

Homo sapiens

Homo sapiens 
Homo sapiens 


Description 

TCR beta chain

T cell receptor beta chain 


S 

score 
1292 
1281 


Percentage 
identity 

93 
93 


1211 
1211 
1211 

1213 
1213 


gil2006041 
ei!4 189960 

fa* * » * \}J J \J\J 

gil9072857 

gi2995719 
ei20072790 


Homo sapiens 
Homo sapiens
Homo sapiens 

Homo sapiens 


T cell antigen receptor beta chain 

AD038 

FKOQ764 

lung squamous cell cancer related protein 
LSCC-3 

protocadherin 43 


1028 
761 
141 
129 

4792 


75 
98 

53
60

100


1213 
1214 

1214 


gi5456977 
gi337487 

gil79882 


Homo sapiens
Homo sapiens 
Homo sapiens 

Homo sapiens 


protocadherin gamma subfamily C, 3
protocadherin gamma C3 
Ro ribonucleoprotein autoantigen (Ro/SS- 
A) precursor 
calreticulin 


4777 
4777 
1747 


99
99
99 


1214 

1215 
1215 


gi22203354

gi200964
gi3228237 


Cricetulus 
griseus 

Mus musculus 
Homo sapiens 


calreticulin 

serine 2 ultra high sulfur protein 
ultra high sulfur keratin


1747 
1687 

319 


99
95 

52 


1215 
1216 

1217 
1217 
1217 


gi200962 
gi 13940422 

gi5917716 
gi 14275701 
gi2738577 


Mus musculus 
Macaca 
sylvanus 
Gallus gallus 
Influenza virus 
Homo sapiens


serine 1 ultra high sulfur protein 
ATPase subunit 8 

sprouty 2 
matrix protein 2 
connexin46.6 


281 
281 
56 

60 
62 
54 


49
50 

31
45


1218 
1218 
1218 
1219 


gi17223709
gi17223711
gi7380925
gi15025778


Homo sapiens 
Mus musculus 
Bos taurus 
Clostridium 
acetobutylicum 


selenoprotein SelM 
selenoprotein SelM 
Fc gamma receptor III 
Predicted membrane protein 


188 

/ J 

50 


50 \ 
1UU J 

78 ] 

4D f 

36 J 


1219 


gi13752743


Serratia 
marcescens


TrpG 


65 


51


1219 


gi20906991 


Methanosarcina 
mazei Goel 


Cation transporter 


62 


29


1220 


gi535358 


Neisseria 
gonorrhoeae 


Opal5063G 


(SO 




1220 

1221 
1221 


gi 1480793 

gi992950 
gil89l51 


Neisseria 
meningitidis 
Homo sapiens 
Homo sapiens 


Opall 
OPN-c 


58 
1426 


47 

98


1221 
1223 


gil001963 
gil8088363 


Homo sapiens 
Homo sapiens 


nephropontin precursor 
osteopontin 

advanced glycosylation end product- 
specific receptor 


1377 

1177 

2004 


90
99 


1223 
1223 


gil841550 
gi6691626 


Homo sapiens 
Homo sapiens 


receptor for advanced glycosylation end 
products 

advanced glycation endproducts receptor 


2004 
2004 


99
99 


1224 
1224 


gi3157464
gi8778370 


Thermus sp. A4 

Arabidopsis 

thaliana 


integral membrane protein 
F1504.23 


77 
65 


38
37 


1224 
1225


gi15156782
gi37231 


Agrobacterium 
tumefaciens str. 
C58 (Cereon) 
Homo sapiens 


AGR_C_3106p 
DNA topoisomerase II 


59 
8061 


34
99


1225
1225


gi3869382
gi790988


Homo sapiens 
Cricetulus
longicaudatus


DNA topoisomerase II beta 

DNA topoisomerase (ATP-hydrolysing) 


8048 
7892 


99 

97
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SEQ 

ID 


Hit ID 


Species 


Description 


S
score 


Percentage 
identity 


1226 


gil0041309 


Homo sapiens 


hFATPl 


3336 


98 


1226 


gu88l713 


Rattus
norvegicus 


fatty acid transport protein


3031 


87 




• i r\r\A i a AT 

guU0413v/ 


Rattus sp.


rFATP 


3031 


87


1227


gliJOyl /O 


Mus musculus


COP9 complex subunit 7b


796 


94 


1227 


gil5215085 


Mus musculus 


Similar to COP9 (constitutive 
photomorphogenic), subunit 7b
(Arabidopsis) 


793 


93 






Homo sapiens


DERP10 (dermal papilla derived protein
10) 


467 


56 


1228 


gi6942096 


Mus musculus 


CBLN3 


938 


93 


1228


nil QAO^ 1 


Homo sapiens


precerebellin 1


551 


58 


1228


gu I\)15 1 1 


Mus musculus


precerebellin-1


544 


57 


1229 


gil7861952 


Drosophila 
melanogaster


LD01947p 


1384 


50 


1229 


gi6850946 


Homo sapiens 


dJ322I12.1 (novel protein similar to C.
elegans C05C8.6 (Tr:O16313))


336 


100 


1229 


gi21411108 


Mus musculus 


Similar to BTB domain protein BDPL 


211 


32 


1230 


gi8132557 


Drosophila 
melanogaster 


ankyrin 2 


729 


30 


1230 


gi710551 


Mus musculus 


ankyrin 3 


734 


29 


1230 


gil841966 


Rattus 
norvegicus 


ankyrin 


700 


30 


1231 


gi21667212


Homo sapiens 


bactericidal/permeability-increasing
protein-like 2


2384 


98 


1231


gi20387085 


Oncorhynchus 
mykiss 


LBP (LPS binding protein)/BPI 
(bactericidal/permeability-increasing 
protein)-1


672 


31 


1231 


gi20387087 


Oncorhynchus 
mykiss 


LBP (LPS binding protein)/BPI
(bactericidal/permeability-increasing
protein) like-2


667 


30 


1232 


gi21667212 


Homo sapiens 


bactericidal/permeability-increasing
protein-like 2


2389 


99 


1232 


gi20387085 


Oncorhynchus 
mykiss 


LBP (LPS binding protein)/BPI 
(bactericidal/permeability-increasing 
protein)-1


664 


31 


1232 


gi20387087 


Oncorhynchus 
mykiss 


LBP (LPS binding protein)/BPI 
(bactericidal/permeability-increasing 
protein) like-2


659 


30 


1233 


gi21667212 


Homo sapiens 


bactericidal/permeability-increasing 
protein-like 2 


2595 


99


1233 


gi20387085 


Oncorhynchus 
mykiss 


LBP (LPS binding protein)/BPI
(bactericidal/permeability-increasing
protein)-1


698 


31 


1233 


gi20387087


Oncorhynchus
mykiss 


LBP (LPS binding protein)/BPI
(bactericidal/permeability-increasing
protein) like-2


693 


30 


1234 


gil9569876 


Dictyostelium 
discoideum 


SIMILAR TO HYPOTHETICAL 26.2 
KD PROTEIN 


247 


26 


1234 


gi2191168 


Arabidopsis 
thaliana 


contains similarity to myosin heavy chain 


187 


27 


1234 


gi603379 


Saccharomyces 
cerevisiae 


Yerl39cp 


145 


28 


1235 


gil 1493528 


Homo sapiens 


PR01953 


671 


100 


1235 


gil9912632 


Eulemur 


MHC class II antigen 


56 


33 
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SEQ 
ID 


Hit ID 


Species 


Description


S
score


Percentage
identity






rubriventer








1235


gi19912630


Eulemur 
macaco macaco 


MHC class II antigen






1236


gll /Uoj9M 


Ostertagia
ostertagi


collagen


70 


35 


1236


oil 5Rft77 
gll DoVt I 


Drosophila
robusta


nf>n aH nrAtein 


69 


38 


1236




Glycine max


dehydrin-like protein


81 


27 


1237 


gi3068592 


Mus musculus 


punc 


2396 


94


1237


glliO /U3VO 


Homo sapiens


1LL/J-/1VIJO 


890 


41 


1237 


gi11862941


Mus musculus 


DDM36E 


892 


41 


1238 


gi 12667401 


Homo sapiens 


NUF2R






1238 


gi14317902


Homo sapiens 


kinetochore protein Nuf2


9147 


99 


1238 


gll 2667403 


Mus musculus 


NUF2R


17S4 


71 


1239 


gi2494126 


Arabidopsis 
thaliana 


contains similarity to Chlamydia outer
membrane protein (gb|X53512). 


04 


91 


1239 


gll 98 87475 


Methanopyrus 
kandleri AV19 


Uncharacterized protein conserved in
archaea 


I/O 


14 


1239


gl21646173 


Chlorobium
tepidum TLS


rioosomai protein ozu 


67 


29 


1240 


gi21634825 


Homo sapiens 


semaphorin 6D isoform 4 


5658 


98 


1240 


gi21634823


Homo sapiens 


semaphorin 6D isoform 3 


1106 


96 


1240 


gi21634827 


Homo sapiens 


semaphorin 6D isoform 1 


3106 


99 


1241 


gi9949555 


Pseudomonas 
aeruginosa 


probable pyruvate dehydrogenase E1
component, alpha subunit 


71 


35 


1241 


gi48708 


Mycobacterium 
tuberculosis 


ORFal (AA1-74) 


58 


37 


1241 


gi307352 


Homo sapiens 


prothymosin alpha 




If. 


1242 


gi9106331


Xylella
fastidiosa 9a5c 


3-dehydroquinate synthase 






1242 


gil3700302 


Staphylococcus 
aureus subsp. 
aureus N315


xanthine phosphoribosyltransferase 


45 


35 


1242 


gi21203529


Staphylococcus 
aureus subsp. 
aureus MW2 


xanthine phosphoribosyltransferase


4S 


IS 


1243 


♦O 1 /C*71 1 AC 


Homo sapiens 




1134 


100 


1243 


gi20070921 


Mus musculus 


RIKEN cDNA 2410008M22 gene 


829 


74 


1243 


gi21594785


Homo sapiens 


Similar to RIKEN cDNA 2410008M22
gene 


5.72 
j /a. 


97 


1244


gi6013381


Rattus 
norvegicus 


IMorl 


147 
it / 


47 


1244 


gil9353944 


Mus musculus 


Ti Tt/TJVr *I~\XT A I^IAllOni Q nana 

RIKEN cDN A zo 1 1)3 1 oO 1 o gene 


107 

1-6/ 


11 
j l 


1244 


gi20270909 


Oncorhynchus 
mykiss 


VHSV-induced protein-6 


118 
l lo 


71 
j l 


1245 


gi6013381 


Rattus
norvegicus


IMorl 


979 
Z /Z 


16 

JO 


1245 


gi21428644 


Drosophila 
melanogaster


LP10820p 


256 


42 


1245 


gi20270909 


Oncorhynchus 
mykiss 


VHSV-induced protein-6 


190 


29 


1246 


gi11993700


Homo sapiens 


melastatin 2 


1194 


100 


1246 


gi3243075 


Homo sapiens 


melastatin 1 


1057 


83 


1246 


gi3047242 


Mus musculus 


melastatin 


1050 


83 


1247 


gi18044366


Homo sapiens 


Similar to MEGF10 protein 


3468 


99 
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1247 


gi17186053


Mus musculus


Jedi protein


2280 


51 


1247 


oi*i8252fi58 


Mus musculus


Jedi-736 protein 


2280 


51 




P i2fl087880 

glZ.U70 / OOv 


Mus musculus


Similar to PTH-responsive osteosarcoma
B1 protein


3586 


87 


1248


oi4588087 

git J OOV/O / 


Homo sapiens


PTH-responsive osteosarcoma B1 protein


2264 


92 


1248 


pi21 59571 1 

gi^> i j y j /ii 


Homo sapiens


Similar to PTH-responsive osteosarcoma
B1 protein


1546 


100 


1249 


pil9913471 


Homo sapiens


similar to dJ84N20.1.1 (novel protein,
isoform 1)


1265 


99 


1249 


eil3591434 


Homo sapiens


dJ84N20.1.2 (novel protein, isoform 2)


1160 


100 


1249 


gil3591435 


Homo sapiens 


dJ84N20.1.1 (novel protein, isoform 1)


976 


99 


1250 


eil6605581 


Homo sapiens


H-rev107-like protein 5


1451 


100 


1250 


pi2 1707989 


Homo sapiens


Similar to H-rev107-like protein 5


1376 


96 


1250 


pi6048565 


Homo sapiens


retinoid inducible gene 1


382 


54 


1251 


gi21263094


Rattus 
norvegicus 


tramdorin 1 


1667 


81 


1251 


gi21263092


Mus musculus


tramdorin 1


1664 


82 


1251 


gi21908026


Mus musculus


proton/amino acid transporter 2


1664 


82 


1252 


gi14571904


Rattus
norvegicus


lysosomal amino acid transporter 1 


1690 


87 


1252 


gi21908024 


Mus musculus 


proton/amino acid transporter 1 


1685 


87 


1252


gi21263092


Mus musculus


tramdorin 1


1294 


66 


1253 


gi21595630 


Homo sapiens 


Similar to forkhead box L2 


75 


44 


1253


gi10580560


Halobacterium

sp. NRC-1 


tranc 1**cir>n r<*nair* YntH 
Llallo IColUll IwJJoIl, I *-JJ 1 1 


69 


51 




oi557673 


Quo cr»mf"ji 


RMRK antippn 

DIVIOO oJlllgdl 


12 


41 


1254 


gi1669500


Mus musculus


fibroblast growth factor homologous
factor 1


917 


90 


1254 


gi1563885


Homo sapiens


fibroblast growth factor homologous
factor 1


917 


90 




gllHO 1 /SO 1 


Rattus
norvegicus


fibroblast growth factor homologous
factor 1B


916 


98 


1 

IX J J 




Homo sapiens


Similar to RIKEN cDNA 1700010H15


779 


100 




gi19763005


Ciona
intestinalis


leucine-rich repeat dynein light chain


759 


75 


1255 


gi2760161


Anthocidaris
crassispina


outer arm dynein light chain 2


656 


68 


1256 


gi12666529


Mus musculus


b,b-carotene-9',10'-dioxygenase


2356 


80 


1256 


gi4001821 


Ambystoma 
tigrinum


RPE65 protein; retinal pigment
epithelium 65-protein


1125 


44 


1256 


gi11990268


Mus musculus


beta,beta-carotene 15,15'-dioxygenase


1110 


42 


1257 


gi12666529


Mus musculus


b,b-carotene-9',10'-dioxygenase


2305 


81 


1257 


gi4001821


Ambystoma
tigrinum


RPE65 protein; retinal pigment
epithelium 65-protein


1122 


44 


1257 


gi11990268


Mus musculus


beta,beta-carotene 15,15'-dioxygenase


1113 


42 


1258 


gi18490501


Mus musculus 


RIKEN cDNA 2010002A20 gene 


868 


76 


1258 


gi61 


Bos taurus 


calmodulin-independent adenylate 
cyclase 


166


29 


1258 


gi15559697


Homo sapiens 


Similar to neural cell adhesion molecule 1 


165 


29 


1259 


gi21748488


Homo sapiens


FLJ00277 protein


50 


52 


1259 


gi2331293 


Mus musculus 


preprocortistatin 


73 


40 


1259 


gil335910 


Rattus 
norvegicus 


preprocortistatin 


58 


36 


1260 


gil079734 


Mus musculus 


citron 


1291 


94 


1260 




Mus musculus 


rho/rac-interacting citron kinase


1286 


94 


1260 


gi2745840 


Rattus 
norvegicus 


postsynaptic density protein; citron 


1262 


93 


1261 


gll4/l3UZy 


Mus musculus 


serine (or cysteine) proteinase inhibitor,
clade E (nexin, plasminogen activator
inhibitor type 1), member 1


407 






glJJ IUOj 


Anno *Y"»nc^*iiliic 


r\v cA pa c p— ti p y i n 1 


406 


38 


1261 


gl41ZlO / 


Homo sapiens


glia-derived neurite-promoting factor
(GdNPF)


397 


38 


1262


gMOZO J 0 1


Homo sapiens


senescence-associated epithelial
membrane protein


223 


97 


1262


gll jZl*tO /O


Homo sapiens


claudin 1


223


97


1262


gl/ JO IvO J


Homo sapiens




223 


97 


1263


ail 1 ^ 444 <i 
gIZ 1 OO t * t rrJ 


Homo sapiens


GTP-binding protein Sara


449 


57 


1263


gl 1 J J*tZOOJ 


Mus musculus


SAR1 protein


446


54


1263


<riRQ7£7fK 
glO^ZOZUJ 


Homo sapiens


SAR1


445 


54 


1264


ml 1 ^^C7fiA 
gl 1 1 JJOZOH 


Homo sapiens


sphingosine-1-phosphatase


697 


37 


1264


tn 1 ^447 1 Q0 


Homo sapiens


sphingosine-1-phosphate phosphatase


683 


37 


1264 


gi9623190 


Mus musculus 


sphingosine-1-phosphate
phosphohydrolase


691 


38 


1265


ml A


Bos taurus




1026 


37 


1265 


gi5107945


Homo sapiens


CD163


1093 


40 


1265


gi312142


Homo sapiens


M130 antigen


1093 


40 


1266




Bos taurus


RnWn 1 

D\J VY 1 • 1 


1026 


37 


1266


gij iu


Homo sapiens




1093 


40 


1266 


gi312142


Homo sapiens 


M130 antigen 


1093 


40 


1267


giioo/3 /UU 


Necator
americanus


NADH dehydrogenase subunit 2


69 


32 


1267 


gi20338417 


Gallus gallus 


potassium channel subunit 


57 


31 


1267 


gi396416 


Escherichia coli 


similar to Neurospora crassa phosphate-
repressible phosphate permease


72 


42 


1268 


gi21619491


Homo sapiens 


similar to expressed sequence AW049604 


778 


100 


1268 


gi6572294


Homo sapiens 


dA262A13.1 (novel protein)


7^1 


49 


1268 


gil61662 


Tribolium 
castaneum 


zinc finger protein 


60 


26 


1269 


gi21591552 


Haemophilus 
influenzae 
biotype 
aegyptius 


cell filamentation-like protein 




^1 i 


1269


gll /OZ/ / 1


Pleurodeles
waltl


homeodomain-containing protein


66 


35 


1269 


guiOZoZJ,} 


Drosophila 
melanogaster 


Vjrn 10 JZ /p 


5^ 
j j 


41 


1270 


gi18033185


Danio rerio


UNC45-related protein


J L\JJ 


7^ 




oi1774R7S7 


Homo sapiens


SMAP-1 


2393 


57 


1270 


gi12248771


Homo sapiens


SMAP-1b


2393 


57 


1271 


gi21064657 


Drosophila 
melanogaster 


RH01479p 


185 


39 


1271 


gi7304173 


Drosophila 
melanogaster 


CG1577-PA 


185 


39 


1271 


gi20150011 


Pseudomonas 
fluorescens 


MmplV 


89 


36 


1272 


gi9366656 


Trypanosoma 


probable similar to ring-h2 finger protein 


76 


55 
brucei


rhala. 






1272 


gi6714271 


Arabidopsis 
thaliana 


F6N18.7 


59 


36 


1272 


gil0440424 


Homo sapiens 


FLJ00047 protein 


74 


50 


1273 


gi15823642


Homo sapiens 


ALS2CR7 


2038 


100 


1273 


gi2645810 


Mus musculus 


Pftaire-1 


1195 


68 


1273 


gi2392814 


Mus musculus 


PFTAIRE kinase 


1190 


67 


1274 


gi2407911 


Homo sapiens 


differentially expressed in Fanconi 
anemia 


714 


96 


1274 


gi21595389 


Homo sapiens 


similar to FYVE finger-containing 
phosphoinositide kinase (1- 
phosphatidylinositol-4-phosphate kinase) 
(PIP5K) (PtdIns(4)P-5-kinase) (p235) 


89 


27 


1274 


gi330134 


human 
herpesvirus 1 


latency-related protein 1 


87 


46 


1275 


gi21908028 


Homo sapiens 


a disintegrin and metalloprotease domain 
33 


4205 


97 


1275 


gi18147612


Homo sapiens 


metalloprotease disintegrin 


4204 


97 


1275 


gi13157560


Homo sapiens 


dJ964F7.1 (novel disintegrin and 
reprolysin metalloproteinase family 
protein) 


3916 


97 


1276 


gi530876 


Chlamydomona 
s reinhardtii 


amino acid feature: Rod protein domain, 
aa 266 .. 468; amino acid feature: 
globular protein domain, aa 32 .. 265 


138 


35 


1276 


gil41852 


Actinomyces 
viscosus 


sialidase 


137 


30 


1276 


gi13926258


Arabidopsis
thaliana


AT5g10430/F12B17_220


110 


34 


1277 


gi15291913


Drosophila
melanogaster


LD31582p


201 


36 


1277 


gi16648042


Drosophila 
melanogaster 


GH07105p 


131 


39 


1277 


gi16416111


Neurospora 
crassa 


related to suppressor protein SPT23 


129 


43


1278 


gi544755 


Oryctolagus 
cuniculus 


aminopeptidase N; APN 


1016 


38 


1278 


gi525287 


Sus scrofa 


aminopeptidase N. 


1012 


39 


1278 


gi205109 


Rattus 
norvegicus 


kidney Zn-peptidase precursor 


1004 


39 


1279 


gi13559063


Homo sapiens 


bA552M11.5 (novel protein) 


747 


100 


1279 


gi9963863 


Homo sapiens 


AD026 


738 


98 


1279 


gi19263987


Homo sapiens 


similar to CMRF35 ANTIGEN 
PRECURSOR 


131 


32 


1280 


gi2773306 


Equus caballus


type II collagen 


69 


31 


1280 


gi3687594 


Canis familiaris 


type IIB procollagen 


69 


31 


1280 


gi8918871 


YccA of
plasmid ColIb-
P9] [Plasmid F


96 pct identical to gp:AB021078_30


64 


26 


1281 


gi9927307 


Mus musculus 


junctophilin type 3 


59 


42 


1281 


gi5881591 


Gallus gallus 


homeodomain protein 


78 


38 


1281 


gi11095167


Bacteriophage 
AR1 


gp38 


76 


34 


1282 


gi13938232


Homo sapiens


Similar to RIKEN cDNA 2610005H11
gene 


78 


32 


1282 


gi13883774


Mycobacterium
tuberculosis
CDC1551


NAD-dependent epimerase/dehydratase
family protein


83


31


1282 


gi5881591 


Gallus gallus 


homeodomain protein 


78


38


1283 


gi13938232


Homo sapiens


Similar to RIKEN cDNA 2610005H11
gene 


78 


32 


1283 


gi13883774


Mycobacterium
tuberculosis
CDC1551


NAD-dependent epimerase/dehydratase 
family protein 


83 


31 


1283 


gi5881591 


Gallus gallus 


homeodomain protein 


78


38


1284 


gi15779156


Homo sapiens 


Similar to RIKEN cDNA 1810073N04 
gene 


4057 


100 


1284 


gi13097045


Mus musculus 


Similar to RIKEN cDNA 1810073N04 
gene 


1727 


91 


1284 


gi18447388


Drosophila 
melanogaster 


DCACQ/I/1« 

Rfc0jy44p 


71 A 


17 




glzlozoo /4 


Drosophila 
melanogaster


priQAin PR 
L-U?4 1 U-iD 


1S4 


46 


1285


rr!*71ft99R1


Drosophila
melanogaster


PfW4 1 fl-P A 


354 


46 


1285


glZi 100U50


Dictyostelium
discoideum


nucleoside diphosphate kinase


164 


30 


1286


glZUy / /Ooo 


Xenopus laevis 


uimorneaci 


146 






giiyu/uozz 


Mus musculus 


nrntpi n P 49 POP 

lviyD proiein rHzrur 


119 


99 


1286 


gi9652255 


Ovis aries 


DNA binding protein pur-alpha 


76 


26 


1287 


giioooioz 


Visna virus 


envelope polyprotein 


O 1 


4R 


1287 


gi6469042 


Mus musculus 


C184M protein 


73 


28 


1287 


gi2098o388 


Mus musculus 


Similar to mammary tumor virus receptor 
z 


71 


9R 
zo 


1288 


gi12309630


Homo sapiens


bA438B23.1 (neuronal leucine-rich
repeat protein)


319 


31 


1288 


gi6273399 


Homo sapiens 


melanoma-associated antigen MG50 


322 


31 


1288 


gi1504040


Homo sapiens 


similar to D.melanogaster 
peroxidasin(U11052) 


JZZ 


11 


1289 


gi16769274


Drosophila 
melanogaster 


LUZZ4zjp 


777 
ZZZ 


94 


1289 


gi18700635


Homo sapiens 


importin 4 


1 I J 


91 


1289 


gi13277562


Homo sapiens 


oimiiar to Kiivurs clun/y ohjih-uovjio 
gene 


1 1 1 
1 1 j 


91 


1290 


gi21391486


Mus musculus 


leucine-rich repeat domain-containing 
protein 


430 


43 


1290


glZlOZO /4U 


Rattus
norvegicus


r angina ft /-»!•» ranoof^nntoiniiKT TuvifrfMtl \ 

Lreucine-ncn repeai-coniaining proiein d 


49 S 


43 


1290


gi21391484


Homo sapiens


leucine-rich repeat domain-containing
protein




39 




glZ lOZ'f jhU 


Homo sapiens




1611 


100 


1291


glZlOZ4J4Z


Mus musculus


ceramide kinase




86 


1291 


gi16768660


Drosophila 
melanogaster 


HL01538p 


292 


41 


1292 


gi50369 


Mus musculus 


precursor protein (AA -34 to 244) 


204 


32 


1292 


gi312590 


Mus musculus 


biliary glycoprotein 


204 


32 


1292 


gi3549152 


Homo sapiens 


R29124 1 


187 


32 


1293 


gi50369 


Mus musculus 


precursor protein (AA -34 to 244) 


204 


32 


1293 


gi312590


Mus musculus 


biliary glycoprotein 


204 


32 


1293 


gi3549152 


Homo sapiens 


R29124 1 


187 


32 


1294 


gi21411450 


Mus musculus 


similar to FLJ00179 protein 


1159 


91 
1294


gi18676564


Homo sapiens 


FLJ00179 protein 


993 


99 


1294


gli /yHjjyZ


Drosophila
melanogaster




486 


59 


1295 


gi7708438 


Homo sapiens 


dJ885A10.1 (similar to cerebellin
precursor)


1020 


100 


1295


glj /UZj / 1


Mus musculus


precerebellin-1


699 


70 


1295


gl 1 OUZO 1


Homo sapiens


precerebellin


696 


74 


1296


glJ^UlUZo


Homo sapiens


neurotensin receptor 2


1436 


100 




mi /iqi^qo 
gll'foJjoV/ 


Rattus
norvegicus




1073 


76 


1296


oi 1 76460Q6


Mus musculus


low affinity neurotensin receptor


1072 


77 


1908 


ai669 455H 


Homo sapiens


dJ61B2.1 (bullous pemphigoid antigen 1
(230/240kD) isoform 3)


1342 
6 


100 


1 90S 


01401194 


Homo sapiens


bullous pemphigoid antigen


9121 


92 


1908 


cn 15077861 
gl I Ju / / OOl 


Mus musculus


bullous pemphigoid antigen 1-e


6442 


67 


1299


gliOl l*rl /U


Homo sapiens


p97 homologous protein


100 


23 


1299 


gi12654337


Homo sapiens 


craniofacial development protein 1 


100 


23 


1299


mll41 8QQ


Homo sapiens


Dun i 


100 


23 


1300


gi6572294


Homo sapiens


dA262A13.1 (novel protein)


499 


100 


1300


gi21619491


Homo sapiens


similar to expressed sequence AW049604


260 


42 


1300 


gi2460196 


Monodelphis
domestica


immunoglobulin Igh@ variable domain 


65 


37 


1301


gi18676652


Homo sapiens


FLJ00225 protein


779 


100 


1301


gi2632952


Bacillus subtilis


yebD


66 


51 


1301


glZU fHyy 1 * l


Drosophila
virilis


fruitless class 1 male isoform


50 


40 


1302


gi18676652


Homo sapiens


FLJ00225 protein


444 


97 


1302 


gi2632952 


Bacillus subtilis


yebD 


59 


48 




gij^fzzy:* 


Macaca
fascicularis


preprosomatostatin


226 


100 


1303 


gi338288 


Homo sapiens 


preprosomatostatin I 


226 


100 


1303


glZLOli' 130


Homo sapiens


somatostatin


226 


100 


1304 


gi14249944


Homo sapiens 


Similar to bromodomain-containing 4 


109 


30 


1304 


gi2865615 


Leishmania 
peruviana 


acidic ribosomal protein P1


93 


36 


1304


gl3434DZ


Tarsius
bancanus


involucrin


114 


24 


1305 


glZ 190*4 


Homo sapiens 


80K-L protein


124 


26 


1305 


gi187387


Homo sapiens 


myristoylated alanine-rich C-kinase 
substrate 


122 


26 


1305 


gi13562004


Nephila
madagascariensis


major ampullate spidroin 2-like protein 


140 


jj 


1306 


gi21744725 


Homo sapiens 


glycosyl-phosphatidyl-inositol-MAM 


1548 


48 


1306 


gi7529597 


Homo sapiens 


dJ402N21.2 (novel protein with MAM
domain)


6S7 


j j 


1306 


gi7529598 


Homo sapiens 


dJ402N21.3 (novel protein with 
Immunoglobulin domains) 


591 


52 


1307 


gi4455102 


Brassica rapa 


pollen-specific protein BAN102 


72 


44 


1307 


gi4096227 


Oryctolagus 
cuniculus 


Ig heavy chain 


68 


31 


1307 


gil7017359 


Talaromyces 
emersonii 


60S ribosomal protein L2 


60 


43 


1308 


gi17429038


Ralstonia
solanacearum


PROBABLE ACYL-COA
DEHYDROGENASE


1166


56


1308 


gi9948609 


Pseudomonas 
aeruginosa 


probable acyl-CoA dehydrogenase 


1121 


57 


1308 


gi13421911


Caulobacter
crescentus


acyl-CoA dehydrogenase family protein


1058 


54 


1309 


gll /4/9UJo 


Ralstonia 
solanacearum 


PROBABLE ACYL-COA
DEHYDROGENASE
OXIDOREDUCTASE PROTEIN


1 100 


DO 


1309




Pseudomonas
aeruginosa


probable acyl-CoA dehydrogenase


1 191 


57 


1309


rr." 11/10101 1 


Caulobacter
crescentus
CB15


acyl-CoA dehydrogenase family protein


1058


54 


1310 


gi19070124


Mus musculus 


zinc transporter-like 3 protein 


1087 


95 


1310




Mus musculus 


zinc transporter o 


1075 

lv / J 


04 


1310




Caenorhabditis
elegans


L/. eiegans i ul- i proiem ^corresponuing 


970 
z iy 


1R 1 


1311 


gi854065 


Human 
herpesvirus 6 


U88 


260 


33 


1311 


gi21928439 


Homo sapiens 


seven transmembrane helix receptor 


174 


29 


1311


glloo93z4o


Pyrococcus
furiosus DSM
3638


smc-like 


177 


94 
Z*f 


1312




Homo sapiens


/1T9101R 9 ^ r\rr\tp>in cimilar tr» orXXckOf^viS 
UJZLVylO.Z ^piutcin omuiai ikj cutidgcuy 


1142 


100 


1312 


gi6526769 


Homo sapiens 


HRIHFB2003 


1055 


97 


1312


gl/zy l*HJo


Drosophila
melanogaster


P01 1 906 PA 
vAji l zuo~r/\ 


/ JO 


41 


1313 


gi19263985


Homo sapiens 


Similar to RIKEN cDNA 1300017E09 
gene 


1565 


99 


1313


a\ 1Q59R^OO


Drosophila
melanogaster


T D09^10n 


573 


55 


1313 


gi7106870


Homo sapiens 


HSPC240 


227 


30 




oi99000696 
gizzuyuozo 


Homo sapiens


HFPT Hntnain nrntpin f A^Tfl 

FLCrV^i UUlllaill plULvlll L>n.JU 1 


1169 

o 


99 




ai6841 104 


Homo sapiens


HSPC272 


9665 


99 


1314 


gi20151907 


Drosophila
melanogaster


SD03277p 


1833 


75 


1315


gi21542541


Homo sapiens


Similar to HTPAP protein


766 


100 


1315


gi13182757


Homo sapiens


HTPAP


473 


100 


1315


<H 14020949 


Arabidopsis
thaliana


phosphatidic acid phosphatase


317 


50 


1316


gi21542541




Similar to HTPAP protein


1204 


99 


1316 


gi13182757


Homo sapiens 


HTPAP 


915 


100 


1316


en' 1 4090040 


Arabidopsis
thaliana


phosphatidic acid phosphatase


460 


41 


1317 


gil80164 


Homo sapiens 


CD7 antigen protein 


1135 


93 


1317 


gi732757 


Homo sapiens 


CD7 antigen 


1135 


93 


1317 


gi14424540


Homo sapiens 


CD7 antigen (p41) 


1135 


93 


1319 


gi16416764


Homo sapiens


FKSG16


2369 


99 


1319 


gi13905212


Mus musculus 


RIKEN cDNA 1200006F02 gene 


1833 


75 


1319 


gi14715055


Homo sapiens


Similar to RIKEN cDNA 1110002C08
gene 


418 


32 


1320 


gi16416764


Homo sapiens 


FKSG16 


323 


98 


1320 


gi13905212


Mus musculus


RIKEN cDNA 1200006F02 gene


957 

J*0 I 


77 


1320 


pi 1471 




Similar to RIKEN cDNA 1110002C08
gene 


07 


00 


1391 




Rattus
norvegicus


proline arginine-rich end leucine-rich
repeat protein


^09 


OL 


1321 


gi21618473


Homo sapiens


proline arginine-rich end leucine-rich
repeat protein


3X0 
oa>y 


39 

JZ 


1321 


m 11 45773 


Homo sapiens


prolargin


3 SO 
ooy 


oz. 


1322 


gi20258604


Homo sapiens


sialic acid binding Ig-like lectin 5


1473 

1*T 10 


OH 


1322 


gi2411475




OB binding protein-2


14.73 

1*T/ O 


RA 
OH 


1322 


gi5759106 


Homo sapiens 


sialic acid binding Ig-like lectin-5; siglec- 
5 


1473 


84 


1323 


gi20258604


Homo sapiens


sialic acid binding Ig-like lectin 5


1 ^75 

10 / o 


87 
o / 


1323 


gi2411475


Homo sapiens


OB binding protein-2


1 ^7^ 
10 ID 


87 


1323 




Homo sapiens


sialic acid binding Ig-like lectin-5; siglec-
5


10 /O 


87 


1324


<n'9O05?775O 


Homo sapiens


Similar to ADAMTS-like 1


ooO 


QO 

yy 


1324


oi 1 5000Q9 1 
gn o\jyyy&\ 


Homo sapiens


ADAM-TS related protein 1




yo 


1324 


gi13183078


Homo sapiens


a disintegrin-like and metalloprotease
domain with thrombospondin type I
motifs-like 3


603 


73 


1326 


gi757915 


Homo sapiens 


apoCII protein 


427 


89 


1326 


ot17S836 


Homo sapiens


apolipoprotein C-II


All 


80 
oy 


1326 


gi342077 


Macaca
fascicularis


apolipoprotein C-II 


371 


78 


1327 


gi21619424


Homo sapiens


Similar to LOC150580


477 


100 


1327




Plasmodium 
falciparum 


erythrocyte membrane protein 1 


63 


25 


1327


ml ^18/Lft9Q 

gllDjOH-UZy 


uncultured
crenarchaeote
74A4


extracellular protein 


64 


*5 1 

31 


1 390 


ml 6033 507 


Homo sapiens


SH2 domain-containing phosphatase 
anchor protein 2d 


1UUJ 


GO 


1390 


m 16033 50 1 

gl 1UUJJJ7 I 


Homo sapiens


oriz Gomain-conxaimng pnospnaiase 
oiii/iiui pioieiii ZrU 


Q01 


yy 


1329 


gi18092655


Homo sapiens


immunoglobulin superfamily receptor
translocation associated protein 5


985 


99 


1330 


gi4877582 


Homo sapiens 


lipoma HMGIC fusion partner 


728 


63 




m 1497991^ 
gJlHZ/ZZjj 


Homo sapiens


DAiooLfO. 1 ^lipoma nMulU iusion 
partner) 


44j 


ol 


1330


eril ^909/117


Drosophila
melanogaster


Lr lUZ/Zp 


lo/ 


Zj 


1331 


gi17426418


Mus musculus 


calmodulin-related protein 


788 


100 




gl IZUOUoZO 


Homo sapiens 


serologically defined breast cancer
antigen NY-BR-20


610 


77 


1331


triSQ19A98


Myxine
glutinosa 


calmodulin 


Jlo 


44 


1332 


gi17862436


Drosophila 
melanogaster 


LD27564p 


152 


26 


1332 


gi13311009


Homo sapiens 


NYD-SP16 


78 


26 


1333 


gi13279251


Homo sapiens 


Similar to wingless-related MMTV 
integration site 6 


2000 


100 


1333 


gi11693044


Homo sapiens 


WNT6 precursor 


2000 


100 


1333 


gi14133265


Homo sapiens 


WNT6 


2000 


100 


1334 


gi20135611


Homo sapiens 


zinc transporter ZnT-5 


463 


94 


1334


oi 1 0744^04


Homo sapiens


zinc transporter 5


463


94 


1334


oil 07 AA^Ofy


Mus musculus




407 


85 


1335


oil R4Rfn6fi


Mus musculus


olfactory receptor MOR145-1


310 


74 


1335 


gi21928214 


Homo sapiens 


seven transmembrane helix receptor 


301 


77 


1330 


ml A AT) 1 Q 
glZ44/Ziy 


Homo sapiens 


nr F4 




71 


1336


gizuyooooo


Homo sapiens


protein inhibitor of activated STAT3




100


1336 


gi4996563 


Homo sapiens 


protein inhibitor of activated STAT3


3277 


100 


1336 


gi17149822


Rattus 
norvegicus 


potassium channel regulatory protein 


jZ 1 1 


yo 


1337 


gi4469173


Gallus gallus 


delta-9 desaturase 


1 1 AQ 


71 


1337 


giiyyoozoo 


Chanos chanos 


stearoyl-CoA desaturase 






1337 


gi5738564 


Ctenopharyngo 
don idella 


delta-9-desaturase


1132 


70 


1338 


gi14030861


Homo sapiens


paraneoplastic neuronal antigen MA1


1830 


99 


1338 


gi18478557


Rattus
norvegicus


paraneoplastic onconeuronal protein MA1


1 1KO 
1 /DZ 


y3 


1338 


• -f f AAA 1 OI 

gil59291o3 


Homo sapiens 


modulator of apoptosis 1 


yyu 


DO 


1339 


gi5452942 


Mus musculus 


glucosidase II beta-subunit 


134 


56 


1339 


gil63157 


Bos taurus 


high-mobility-group protein


i on 
lZu 


A1 

43 


1339 


gil5076513 


Mus musculus 


22 kDa neuronal tissue-enriched acidic 
protein


in 
131 


zo 


1341 


gi11177514


Homo sapiens


tandem pore domain potassium channel
THIK-2


2234 


100 


1341


gi11177510


Rattus
norvegicus


tandem pore domain potassium channel
THIK-2


ZZ1 J 


OR 


1341 


gUDZl j303 


Homo sapiens 


potassium cndnnci, bUDidniiiy iv, nicinucr 

V\ 
u 


1J*tU 




1342 


gi14336716


Homo sapiens 


similar to FBan0003337 


1216 


100 




glZl/y 5/330 


Mus musculus


rviJveiN cliin/v Jr\yj\j\j lurzi gene 


427 


50 


1342 


gil9886829 


Methanopyrus
kandleri AV19


SAM-dependent methyltransferase


104 


31 


1343 


giiyj fyjoyo 


Homo sapiens 


IiUUMjO 


i ns 

1 LJO 




1343 


gi l Loozboy 


Mus musculus 


DUNLjO 




4^ 


1343


gn ioozy4i


Mus musculus


UUIVIjOD 




4^ 
t j 


1344 


gi21744725


Homo sapiens 


glycosyl-phosphatidyl-inositol-MAM 


4898 


98 


1344


gi7529598


Homo sapiens


dJ402N21.3 (novel protein with
Immunoglobulin domains)


1 SdR 


QQ 


1344


gi7529597


Homo sapiens


dJ402N21.2 (novel protein with MAM
domain)


1 10 1 


04 


1345


gll2Z7oiyo 


Homo sapiens 


rtvoU4U 


lUZU 


i no 


1345


gll240oZDU 


Homo sapiens 


rKoOZo 


1 n?n 

1UZU 


inn 


1345


gl 18652934 


Xenopus laevis 


Mig3U 




40 


1346 


gi16769552


Drosophila
melanogaster


LL)3o375p 


1 

13D4 


41 
41 


1346 


gi7523707 


Arabidopsis
thaliana


Putative membrane protein 


line 
1 IUj 


3y 


1346 


gil632829 


Plasmodium 
falciparum 


AARP2 protein 


467 


36 


1347 


gi20987450 


Homo sapiens 


LOC146433


1162 


95 


1347 


gi3093373 


Mus musculus


small proline-rich protein 21 


64 


39 


1347 


gi912799 


Homo sapiens 


type I hair keratin 


63 


33 


1348 


gil016012 


Rattus 
norvegicus 


neural cell adhesion protein BIG-2 
precursor 


5093 


93


1348 


gi19913548


Homo sapiens


similar to axonal-associated cell adhesion
molecule


3630


99


1348


glZUwD /


Mus musculus


neuronal glycoprotein 


3630 


64 


1349


ml 59Q9437


Drosophila
melanogaster


T PI 0979r» 


AAA 
441 


39 


1349


gi4877582


Homo sapiens


lipoma HMGIC fusion partner


n i 


28 




oi 1 66484S4 


Drosophila
melanogaster


OU\J 1 ZO JU 


1 6T> 


24 


1350 


gi13097705


Homo sapiens


serine (or cysteine) proteinase inhibitor,
clade A (alpha-1 antiproteinase,
antitrypsin), member 3


1Q9S 
YyLD 


07 


1350 


gi1340142


Homo sapiens


alpha1-antichymotrypsin


1921 


97 


1350 


gi4165890


Homo sapiens


alpha-1-antichymotrypsin precursor


1 OJU 


Q7 


1351 


gi21618556 


Homo sapiens


trophinin associated protein (tastin)


^1 ^4 


R4 


1351 


gi905356


Homo sapiens


tastin 


D lAy 


R4 

0*1- 


1351 


gi7861746 


Mus musculus 


GABA-A receptor epsilon-like subunit 


165 


40 


1352 




Homo sapiens


uivc v protein 


1 £fiG 

locy 


i on 


1352 


gi12053851


Homo sapiens 


DREV1 protein 


1676 


99 


1352


ai190SSfiQ1


Mus musculus


LiKJtiV protein 


1655 


97 


1353 


gi14627081


Homo sapiens


caspase-1 dominant-negative inhibitor
pseudo-ICE


492 


100 


1353


<n?1707^^S 

glZi 1 /v / JDJ 


Homo sapiens


Similar to CARD only protein


40Z 


1 OA 




gl lOUZOO 


Homo sapiens


interleukin 1-beta convertase


AAZ 


oo 
92 


1354 


gil7431573 


Ralstonia
solanacearum


PUTATIVE LIPOPROTEIN
TRANSMEMBRANE


82 


42 


1354 


gi995704 


Saccharomyces
cerevisiae


L3149 


69 


23 


1354 


gl l/iJU077 


Saccharomyces
cerevisiae


I ni Jowp 


/CO 


oi 
/j 


1355 


gil2034719 


Mus musculus 


ankyrin-like protein 


413 


43 


1355


gi13469729


Homo sapiens


breast cancer antigen NY-BR-1


A1 < 


Ad 
*ty 


1355


glX lUlOJOO


Homo sapiens


testis-specific ankyrin motif containing
protein


ioz 


40 


1356 


gi8272557


Rattus
norvegicus


protein kinase WNK1




7T 
/ j 


1356 


gi6933864 


Homo sapiens 


kinase deficient protein KDP


^408 


100 


1356 


gi19032238


Homo sapiens 


protein kinase WNK3 


1664 


56 


1357 


gi8272557


Rattus
norvegicus


protein kinase WNK1


DHjy 


7T 


1357 


gi6933864


Homo sapiens


kinase deficient protein KDP


I 1 

I I Dy 


yo 


1357 


gi19032238


Homo sapiens 


protein kinase WNK3 


530 


40 


1358 


gi10946203


Homo sapiens


neuromedin U receptor 2


7Q< 


1 OA 


1358 


gi9944990 


Homo sapiens 


neuromedin U receptor-type 2 


785 


100 




<ri 1 £877177 


Homo sapiens 


neuromedin U receptor 2 


785 


100 


1359 


gi17861592


Drosophila 
melanogaster 


GH13807p 


1234 


45 


1359 


gi18376566


Caenorhabditis
elegans


Y105E8A.20


~U*T 


40 


1359 


gi9368514 


Leishmania 
major 


methionyl-tRNA synthetase 


963 


42 


1360 


gi17389919


Homo sapiens 


Similar to major histocompatibility 
complex, class II, DP beta 1 


819 


100 


1360 


gi575494 


Homo sapiens 


MHC class II lymphocyte antigen beta 
chain 


437 


72 


1360 


gi188479


Homo sapiens


HLA-DPB1


437 


72 


1361 


gi3342737 


Homo sapiens 


R26660_2, partial CDS 


1025 


97 


1361 


gi14625940


Homo sapiens


interleukin-10


AO 




1361 


gi3005997 


okra yellow
vein mosaic
virus


AC2 


77 


35 


1362 


gi3342737 


Homo sapiens


R26660_2, partial CDS


JLUU 1 


OA 


1362 


gi14625940


Homo sapiens


interleukin-10


AO 


^7 
jj 


1362 


gi3005997 


okra yellow 
vein mosaic 
virus 


AC2 


77 

/ / 


JJ 


1363 


gi13991167


Homo sapiens


sialic acid-binding immunoglobulin-like
lectin-like long splice variant


2879




1363 


gi14625822


Homo sapiens


Siglec-L1


2879 


09 

yy 


1363 


gi15824310


Pan troglodytes


sialic acid-binding lectin Siglec-L1


2804 


07 


1364 


gi20072749 


Homo sapiens 


similar to interferon alpha/beta receptor 1 


879 




1364 


gi571296 


Homo sapiens 


CRFB4 


188 

1 oo 


97 


1364 


gi4028135 


Gallus gallus


interferon alpha/beta receptor 1 


195 


27 


1365 


gi8572055 


Homo sapiens 


interleukin-1 receptor antagonist homolog
1


891


100


1365 


gi6049805 


Homo sapiens 


interleukin-1 receptor antagonist homolog


89 ^ 


100 


1365 


gi6165334 


Homo sapiens 


interleukin-1-like protein-1


891 


100 


1366 


gi177870


Homo sapiens


alpha-2-macroglobulin precursor


9780 


AO 


1366 


gi579594 


Homo sapiens 


alpha 2-macroglobulin 690-740


977S 


Art 


1366 


gi579592 


Homo sapiens 


alpha 2-macroglobulin 690-730 


2774 


40 


1367 


gi4574224


Fundulus 
heteroclitus 


multidrug resistance transporter homolog 


287 


49 


1367 


gi19743730


Rattus 
norvegicus 


ATP-binding cassette protein B1b


285 


50 


1367 


gi34525 


Homo sapiens 


P-glycoprotein (431 AA)


273 


50 


1368 


gi198922


Mus musculus 


lymphocyte differentiation antigen 


713 


100 


1368 


gi198926


Mus musculus 


Ly-6A.2 alloantigen 


713 


100 


1368 


gi198930


Mus musculus 


differentiation antigen Ly-6E/A 


713 


100 
SEQ_ID


Hit_ID


Species


Description


S_score


Percentage
identity


no c 

685 


gi183150


Homo sapiens 


chorionic somatomammotropin
P<3.5 




100 


685 


gl23271 17U 


Homo sapiens 


chorionic somatomammotropin
hormone 2


97S 
z / J 




Ooj 


glZoioo/4.5 


Pan troglodytes


•nlopAntcil lantAUfm PT -R 
JJlaUCUlal laULUgdl a -U 


279 


98 


OoO 


fri! 8^178 

glloJ 1 /o 


Homo sapiens


1 IVJl 1 V Xr 


1033 


78 


OoO 


gl/j// 1 1 /u 


Homo sapiens


chorionic somatomammotropin
hormone 2


707 


92 


68o 


'OO 1 QC7/11 

glzoIoo/43 


Pan troglodytes 


placental lactogen r l^-d 


715 


94 


6oo 


1 8A8B8QA 

glloUoooju 


Homo sapiens


A A H90756 


785 


100 


OOO 


m 1 Ql 1 T8 

glloJl /o 


Homo sapiens


nvjn- v z 


1051 


79 


/TOO 

OBo 


gioujozoy i 


Homo sapiens




785 


100 


oay 


gllZOjjDUl 


Homo sapiens


orirsx iiNr i protein 


2003 


95 


ooy 


gljUOojZoJ 


Homo sapiens


m ptnKpr 1 

j ineinucL i 


2003 


95 




gloUOoDO 1 1 


synthetic construct


, 111CI11UC1 1 


2003 


95 




gizuzoyyo / 


Sus scrofa


delta 4 


1033 


88 


£QA 

oyu 


glZl.5U/OiU 


Mus musculus


phospholipase C delta 4


909 


77 


690 


gi571466 


Rattus norvegicus 


phospholipase C delta-4 


893 


76 


691 


gll /o04Uz3 


Homo sapiens 




^574 


100 


691 


glZZ/OU3o5 


Homo sapiens 


unnamed protein product


J J 1 J 


99 


/:a i 

oyl 


gizz/oiuio 


Homo sapiens 


unnamed protein product


3524 


100 


69 z 


giLZoy /y^j 


Homo sapiens 


isj./\/\ ioy*+ protein 


3850 1 


100 


692 


r^OAICfiAIA 


Mus musculus


H7j jhu /oujivik protein 




98 

70 


£AO 

oyz 


glZ/0Dzj4/ 


Homo sapiens 


truncated c-Maf-inducing
protein




99


oyj 


gl4o /OOZ 


Oryctolagus 
cuniculus 


interleukin-8 receptor subtype
B


188


61 


693 


m 'C11 OA1 

glO 1 lOKJJ 


Homo sapiens 


interleukin-8 receptor type B


179 


57 


693 


gi576679 


Homo sapiens 


interleukin 8 receptor B 


111 


57 


694 


gl3z9ooUo9 


Homo sapiens 


CD39L2 nucleotidase


95 14 


99


694 


gi3335098


Homo sapiens 


CD39L2


9^90 
ZjZO 


100


694 


gi4691263 


Homo sapiens 




2513 


99 


695 


gi16566319


Homo sapiens 


A C/l 111 A*7 1 O wrnfoin 

Ar4 li iu /_i u protein- 
coupled receptor 




QQ 

yy 


695 


gi21928620


Homo sapiens 


seven transmembrane helix 
receptor 


to JO 


100 


695 


gi22293641 


Homo sapiens 


putative orphan G protein- 
coupled receptor 26 


845 


51 


696 


gi24660226 


Homo sapiens 


C-type lectin-like receptor-1


1460 


90 


696 


gi7110216


Homo sapiens 


AF200949_1 C-type lectin-like
receptor-1


1458 


90 


696 


gi7110218


Mus musculus


AF201457_1 C-type lectin-like
receptor 2 


ill 




698 


gi18089247


Homo sapiens 


AAH20966 Similar to 
ectonucleoside triphosphate
diphosphohydrolase 5 


2104 


100 


698 


gi30584801 


synthetic construct 


Homo sapiens ectonucleoside 
triphosphate 
diphosphohydrolase 5 


2104 


100 


698 


gi3335102 


Homo sapiens 


CD39L4 


2104 


100 


699 


gi804761 


Homo sapiens 


putative 


247 


77 


700 


gi16184225


Drosophila 
melanogaster 


LD24527p 


666 


42 


700 


gi27447597 


Drosophila 
melanogaster 


transcriptional adapter 2S 


666 


42 


700 


gi7298997 


Drosophila 
melanogaster 


CG9638-PA 


666 


42 


701 


gi17225457


Homo sapiens 


AF326917_1 autism-related 
protein 1 


1272 


36 


701 


gi27817314 


Danio rerio 




1234 


36 


701 


gi29468246 


Homo sapiens 


XTP9 


3605 


99 


702 


gi20810589 


Homo sapiens 


similar to arsenite inducible 
RNA associated protein 


833 


99 


702 


gi22945274 


Drosophila 
melanogaster 


CG12795-PA 


455 


54 


702 


gi9651711 


Mus musculus 


AF224494_1 arsenite inducible 
RNA associated protein 


687 


80 


703 


gi13241652


Rattus norvegicus 


AF309558_1 supernatant 
protein factor 


2040 


93 


703 


gi13543184


Mus musculus 


SEC14-like2 


2038 


93 


703 


gi6624130 


Rattus norvegicus 


AC004832_1 similar to 45 kDa 
secretory protein 


2150 


100 


704 


gi11066250


Homo sapiens 


AF197937_1 presenilins
associated rhomboid-like 
protein 


1693 


86 


704 


gi13177766


Homo sapiens 


AAH03653 Similar to 
presenilins associated 
rhomboid-like protein 


1761 


99 


704 


gi15559382


Homo sapiens 


AAH14058 presenilins 
associated rhomboid-like 
protein 


1696 


86 


705 


gi1864091


Rattus norvegicus 


PSD-95/SAP90-associated
protein-3 


4997 


95


705 


gi2454510 


Homo sapiens 


PSD-95/SAP90-associated 
protein-2 


2105 


47 


705 


gi6979175 


Homo sapiens 


AF119818_1 homolog-
associated protein 2 


2089 


47


706 


gi11877274


Homo sapiens 




2260 


99 


706 


gi21667210 


Homo sapiens 


AF465765_1
bactericidal/permeability- 
increasing protein-like 1 


2260 


99 


706 


gi21706776


Homo sapiens 


Bactericidal/permeability- 
increasing protein-like 1 


2253 


99 


707 


gi16768190


Drosophila 
melanogaster 


GH22974p


647 


41 


707 


gi24659527 


Homo sapiens 




2006 


100 


707 


gi7291716 


Drosophila 
melanogaster 


CG11388-PA 


648 


41 


708 


gi14334082


Mus musculus 


AF367970_1 thymus LIM
protein TLP-A


479 


87 


708 


gi14335908


Mus musculus 


thymus LIM protein TLP-A 


479 


87 


708 


gi14335909


Mus musculus 


thymus LIM protein TLP-B 


396 


90 


709 


gi12804105


Homo sapiens 


AAH02905 Similar to 
CG15084 gene product 


2090 


100 


709 


gi13649459


Homo sapiens 


AF250306_1 putative SB115
protein 


2090 


100 


709 


gi18204670


Mus musculus 


4930527D15Rik protein


1015 


96 


710 


gi1674440


Homo sapiens 


collagen type IV a6 chain 


4222 


51 


710 


gi1674441


Homo sapiens 


collagen type IV a6 chain 


4222 


51 


710 


gi556299 


Mus musculus 


alpha-2 type IV collagen 


8126 


83 


711 


gi438007 


Gallus gallus 


alpha-2-macroglobulin receptor


15742 


60 


711 


gi7861733 


Homo sapiens 


AF176832_1 low density
lipoprotein receptor related
protein-deleted in tumor


23654


99


711 


gi8926243 


Mus musculus 


AF270884_1 low density
lipoprotein receptor related
protein LRP1B/LRP-DIT


23098


92


712 


gi17298315


Homo sapiens 


candidate tumor suppressor
protein


848


100


712 


gi7861733 


Homo sapiens 


AF176832_1 low density
lipoprotein receptor related
protein-deleted in tumor


848


100


712 


gi8926243 


Mus musculus 


AF270884_1 low density
lipoprotein receptor related
protein LRP1B/LRP-DIT


731


83


713 


gi13544080


Homo sapiens 


AAH06171 hypothetical
protein MGC2731


1133


100


713 


gi20071811 


Mus musculus 


5830411E10Rik protein


492 


55 


713 


gi33589496 


Drosophila
melanogaster


LD31278p


401


44


714 


gi157409


Drosophila
melanogaster


fat protein


3001


40


714 


gi22945533 


Drosophila
melanogaster


CG17941-PA


2292


34


714 


gi7295732 


Drosophila
melanogaster


CG3352-PA


3015


40


715 


gi157409


Drosophila
melanogaster


fat protein


3007


40


715 


gi22945533 


Drosophila
melanogaster


CG17941-PA


2289


34


715 


gi7295732 


Drosophila
melanogaster


CG3352-PA


3021


40


716 


gi17865311


Homo sapiens 


AF452102_1 dipeptidyl
peptidase-like protein 9


4370


95


716 


gi27549552 


Homo sapiens 


dipeptidyl peptidase IV-related
protein-2


4370


95


716 


gi29293087 


Homo sapiens 


dipeptidyl peptidase 9 


4511 


95 


717 


gi2689444 


Homo sapiens 


ZNF134 


1252 


57 


717 


gi31565347


Homo sapiens 


LOC284018 protein


1252 


57 


717 


gi9968290 


Homo sapiens 


zinc finger protein 304 


1094 


47 


718 


gi23468368 


Mus musculus 


1200013F24Rik protein 


690 


90 


718 


gi27695305 


Mus musculus 


1200013F24Rik protein 


715 


91 


718 


gi7582294


Homo sapiens 


AF208853_1 BM-011


881 


100 


719 


gi1620870


Ciona intestinalis 


myoplasmin-C1


410 


27 


719 


gi7416982


Argopecten irradians 


myosin heavy chain cardiac
muscle specific isoform 1


255


20


719 


gi7416983 


Argopecten irradians 


myosin heavy chain cardiac
muscle specific isoform 2


255


20


720 


gi13872813


Homo sapiens 


fibulin-6 


13764 


100 


720 


gi14575679


Homo sapiens 


AF156100_1 hemicentin


13720 


99 


720 


gi3879658 


Caenorhabditis
elegans




1636


29


721 


gi13177673


Homo sapiens 


AAH03621 


1520 


45 
721


gi19354327


Homo sapiens 




1520 


45 


721 


gi3822553 


Gallus gallus 


nuclear calmodulin-binding 
protein 


2238 


61 


722 


gi17223626


Homo sapiens 


ATP-binding cassette A10


7963 


99 


722 


gi32350914 


Homo sapiens 


ATP-binding cassette sub- 
family A member 10 


7943 


99 


722 


gi32350969 


Homo sapiens 


ATP-binding cassette sub- 
family A member 10 


7943 


99 


723 


gi13374079


Homo sapiens 


TAFII140 protein 


3677 


99 


723 


gi13374178


Mus musculus 


TAFII140 protein 


3193 


84 


723 


gi28175603 


Homo sapiens 


TAF3 protein 


2772 


99 


724 


gi17429038


Ralstonia 
solanacearum 


PROBABLE ACYL-COA 
DEHYDROGENASE 
OXIDOREDUCTASE 
PROTEIN 


658 


61 


724 


gi22776354 


Oceanobacillus 
iheyensis HTE831 


acyl-CoA dehydrogenase 


638 


63 


724 


gi28280023 


Mus musculus 


5730439E10Rik protein 


946 


85 


725 


gi21522768 


Homo sapiens 


unnamed protein product 


3060 


100 


725 


gi24047224 


Homo sapiens 


Similar to EGF-like-domain, 
multiple 6 


3060 


100 


725 


gi6752658 


Homo sapiens 


AF186084_1 epidermal growth
factor repeat containing protein 


3055 


99 


726 


gi14530342


Caenorhabditis 
elegans 




1008 


36 


726 


gi6531661 


Caenorhabditis 
elegans 


AF195610_1 LIN-41A


1008 


36 


726 


gi6531663 


Caenorhabditis 
elegans 


AF195611_1 LIN-41B


1008 


36 


727 


gi1504026


Homo sapiens 




5833 


99 


727 


gi22725157 


Homo sapiens 


minor histocompatibility 
antigen HA-1 


5833 


99 


727 


gi23272016 


Homo sapiens 


Similar to PTPL1-associated
RhoGAP 1 


5690 


98 


728 


gi13274120


Homo sapiens 




1467 


99 


728 


gi6102996 


Mus musculus 


Vanin-3 


1018 


79 


728 


gi7160973 


Homo sapiens 


VNN3 protein 


1213 


96 


729 


gi27463365 


Homo sapiens 


a disintegrin-like and 
metalloprotease with 
thrombospondin type 1 motifs 
9B 


8961 


99 


729 


gi28804249 


Mus musculus 


metalloprotease-disintegrin 
protease 


4974 


55 


729 


gi9581879 


Homo sapiens 


AF261918_1 disintegrin 
metalloproteinase with 
thrombospondin repeats 


5723 


99 


730 


gi21063967 


Drosophila 
melanogaster 


AT05453p 


382 


31 


730 


gi5911409 


Drosophila 
melanogaster 


fuzzy 


382 


31 


730 


gi7297412 


Drosophila 
melanogaster 


CG13396-PA 


382 


31 


731 


gi15488017


Homo sapiens 


AF407274_1 EWI2 


2302 


100 


731 


gi27497567 


Homo sapiens 


keratinocytes associated 
transmembrane protein 4 


2302 


100 
731


gi31753233


Homo sapiens 


Immunoglobulin superfamily, 
member 8 


2302 


100 


732 


gi15488017


Homo sapiens 


AF407274_1 EWI2


3200 


100


732 


gi27497567 


Homo sapiens 


keratinocytes associated 
transmembrane protein 4 


3200 


100 


732 


gi31753233


Homo sapiens 


Immunoglobulin superfamily, 
member 8 


3200 


100 


733 


gi22266726 


Homo sapiens 


AF311906_1 LIR-D1 
precursor 


1303 


96 


733 


gi27497567 


Homo sapiens 


keratinocytes associated 
transmembrane protein 4 


1303 


96 


733 


gi31753233


Homo sapiens 


Immunoglobulin superfamily, 
member 8 


1303 


96 


734 


gi21748480 


Homo sapiens 


FLJ00271 protein


605 


100 


734 


gi27497567 


Homo sapiens 


keratinocytes associated 
transmembrane protein 4 


513 


79 


734 


gi31753233


Homo sapiens 


Immunoglobulin superfamily, 
member 8 


513 


79 


735


gl314D34j / 


Homo sapiens 


putative NFkB activating 
protein 


JO J 


A A 

44 


735


r»i7A77Q19 


Homo sapiens 


unnamed protein product 


1704 

1 /y4 


00 i 
yy 


735




Drosophila 
melanogaster 


lAj/3Z3-rA 


55y 


30 


736


gi lzoU4 toy 


Homo sapiens 


A A WA7QA7 


1AQA 
J4V4 


07 

y 1 


736


cril S77Q17S 


Homo sapiens 


/Vf\rii403z oimuar to 

\i\mrsthf>t\m) nmfpin T^CClCWQAO 
IiypUUlCUCal piUlClll DVyUu^7 < TA 


jD JZ 


Q7 
y 1 


736 


mi 808R919 

j^l 1 0U007J7 


Homo sapiens




14Q4 


97 
y 1 


737 


gi12836469


Mus musculus 


unnamed protein product 


3495 


87 


737


m9^7S1 1 1 *\ 
glZOjJ lllJ 


Mus musculus 


unnamed protein product


J400 


87


737


gi30721603 


Mus musculus 


RAVER1


3466 


87 


738


rrii 7AA7AAA 


Homo sapiens 


Aruo i /3Z_ i iviyuzy protein 


/I 1 < 

41D 


100


739 


gi15489209


Mus musculus 


BC013712 protein 


266 


31 


739 


gi21757804


Homo sapiens 


unnamed protein product 


1226 


96 


739 


gi26354220 


Mus musculus 


unnamed protein product 


1130 


79 


740


gi15341806


Homo sapiens 


AAH13073


2008


100 


740 


gi19528077


Drosophila 
melanogaster 


AT24025p 


165 


38 


740 


gi21627272


Drosophila 
melanogaster 


CG12765-PA 


167 


24 


741 


glZ34y5Z23 


Plasmodium
falciparum 3D7


ABU14o34_oO liver stage 
antigen, putative 


407 


23 


"7/1 1 

/41 


gi3/4yzy4u 


Homo sapiens 


medulloblastoma antigen MU-
MB-20.201


DiO 


ZD 


741 


gi9916 


Plasmodium 

falciparum 


liver stage antigen 


393 


24 


742




Homo sapiens


APH7917 1 r\rr\tr\rc*Ahf*rtr\ 11 
Arjj^i/^i pruiuvauncriii 11 


JjJH 


JO 


742 


gi15054521


Homo sapiens 


AF217288_1 protocadherin-S 


3362 


58 


742 


gi9845485


Homo sapiens 


AF169692_1 protocadherin-9 


6235 


100 


743 


gi16552038


Homo sapiens 


unnamed protein product 


2404 


99 


743 


gi21410124 


Mus musculus 


3230402E02Rik protein 


1501 


61 


743 


gi5688958 


Homo sapiens 


PMMLP 


2405 


100 


744 


gi21734445 


Rattus norvegicus 


BMP/Retinoic acid-inducible 
neural-specific protein-2


3987 


94 


744 


gi21734447


Rattus norvegicus 


BMP/Retinoic acid-inducible
neural-specific protein-3


2948


70


744


gljUJ4oolU 


Gallus gallus 


BMP/retinoic acid-inducible 
neural-specific protein 


2090 


52 


745




Homo sapiens 


ZNF91L


2077 


69 


745 


gi27693081 


Homo sapiens 




2054 


71 


745


gi30421228


Homo sapiens 


zinc finger protein 430 


2486 


96 


746


gizi272o77 


Homo sapiens 


Similar to zinc finger protein 
208 


2472 


78 


746


glZozS 1 /5 J 


Homo sapiens 


ZNF431 protein 


2480 


79 


746 


gi30421228 


Homo sapiens 


zinc finger protein 430


3174 


100 


747 


gi1212965


Homo sapiens 


transmembrane protein 


1010 


99 


747


gi1213221


Rattus norvegicus 


transmembrane protein 


1006 


98 


747 


gi19683999


Homo sapiens 


coated vesicle membrane 
protein 


1010 


99 


748


gill 99524 


Homo sapiens 


acid phosphatase 


2147 


95 


748


'11111 <V7C 

gu31119/D 


Homo sapiens 


AAH03160 acid phosphatase 
2, lysosomal 


2143 


95 


748 


gi30584617 


synthetic construct 


Homo sapiens acid 
phosphatase 2, lysosomal 


2143 


95 


749


gliJOZD3 IK) 


Homo sapiens 


Ar4 1 1 9 o 1 1 centaunn beta 5 


3851


95 


749


glZ54ZZ /U4 


Homo sapiens 


CENTB5 protein


2912 


100 


749 


gi30109272 


Homo sapiens 


CENTB5 protein 


4175 


99 




gi10197642


Homo sapiens 


AF182421_1 MDS022


647 


100 




1 COOA/1T7 

gl I j 929423 


Homo sapiens 


Hypothetical protein FLJ20502


938 


100 


"7^ A 


gl3UZ//o9o 


Mus musculus 


D5Buc26e protein 


423 


78 


751


guoOl4Uzo 


Homo sapiens 


zinc finger DNA binding
protein p71


998 


40 


751


rrJ77/C01C<CC 

giz/oyjojo 


Homo sapiens 


zinc finger protein 398 


998


40 


751 


gi5630080 


Homo sapiens 


AC004890_2


984 


36 




gll 1j4jjo2 


Homo sapiens 


AF308801_1 vacuolar protein 
sorting protein 16 


3724 


95 


752


gl 1Z l*tvZi7U 


Homo sapiens 




IT) A 

3 1 z4 


95 


752


gl 1 jJJJvHO 


Mus musculus 


Vps16


iozo 


92 


753 


gi30141048 


Homo sapiens 


Nogo-66 receptor homolog-1 


2226 


100 




rrilAI/11 A^O 

gL>U14lLDZ 


Rattus norvegicus 


Nogo-66 receptor homolog-1 


2130 


95 


753 


gi32351287 


Rattus norvegicus 


Nogo-66 receptor homolog 2 


916 


51 


754


gi177870


Homo sapiens 


alpha-2-macroglobulin 
precursor 


2718 


39 


754


glZj3UJ94o 


Homo sapiens 


alpha-2-macroglobulin 


2718 


39 


754


gi579592


Homo sapiens 


alpha 2-macroglobulin 690-730


2712 


39


755


rrl 1 OA/M ^A1 

glloU44jUI 


Mus musculus 


angiopoietin-like 3 


1692 


70 


755 


gi4929790 


Homo sapiens 


AF152562_1 angiopoietin- 
related protein 3 


2210 


93 


755


gijoiyyy / 


Mus musculus 


AFlo2224_l angiopoietin- 
related protein 3 


1692 


70 


756 


gi200057


Mus musculus


neuronal glycoprotein


AS71 


87


756 


gi29837411 


Homo sapiens 


BIG-2 


3898 


69


756 


gi563133 


Rattus norvegicus 


BIG-1 protein 


4778 


87 


757 


gi16550078


Homo sapiens 


unnamed protein product 


3710 


99 


757 


gi28175743


Homo sapiens 


similar to hypothetical protein 
FLJ30803 


3714 


100 


757 


gi30354720 


Mus musculus 


AI427653 protein 


3609 


96 


758 


gi26329813 


Mus musculus 


unnamed protein product 


3627 


93 


758 


gi28175743


Homo sapiens 


similar to hypothetical protein
FLJ30803


3612


98


758 


gi30354720 


Mus musculus 


AI427653 protein


3520 


95 


759 


gi21929093 


Homo sapiens 


seven transmembrane helix
receptor 


1718 
1 / 15 


CO 

oo 


759 


gi24286029 


Homo sapiens 


G protein-coupled receptor
GPR116 


A777 
Off/, 


oo 
yo 


759 


gi5525078 


Rattus norvegicus 


SPVPTl f"T*£>n c m PrnKrcsnA rononfnr 


DU-+5 


fl 


760 


gi 10440398 


Homo sapiens 


FT TflflfH? nrnfpiri 
ri^juviujz protein 




ol 


760 


gi11917507


Homo sapiens 


HPF1 protein 


1254 


62 


760 


gi15929737


Mus musculus


biiiiiiiir to jviv/vrj zinc ringer 


1249 


CO 

58 


761 


gi13097633


Homo sapiens 


AAH0^S^4 Similar tn ATPqco 

Class I, type 8B, member 1 




S3 


761 


gi33440008 


Homo sapiens


pudoiuic aminopnospnoupiQ 
translocase ATP8B2 


3473 


66 


761 


gi3628757 


Homo sapiens 


Fin 


2j /o 


53 


763 


gi11558486


Homo sapiens 


B-cell lymphoma/leukaemia
11A short form


1314 


99 


763 


gi18089267


Homo sapiens


/V/vTlZlVi/O 


1 153 


100 


763 


gi30410854 


Mus musculus 




1312 


98 


764 


gi32394378 


Homo sapiens 


forkhead-associated domain 


1808 


100 


764 


gi32394380 


Bos taurus 


forkhead-associated domain
histidine-triad like protein 


lo3o 


on 

89 


764 


gi32394382 


Sus scrofa 


forkhead-associated domain
histidine-triad like protein 


i /coi 


91 


765 


gi31455403


Homo sapiens 


aprataxin


241


91 


765 


gi31455405


Homo sapiens 


aprataxin


23 D 


100


765 


gi32394378 


Homo sapiens 


forkhead-associated domain 
histidine-triad like protein


241 


97 


766 


gi31455403


Homo sapiens 


aprataxin


31o 


100


766 


gi32394378 


Homo sapiens 


forkhead-associated domain 
histidine-triad like protein


318 


100 


766 


gi32394382 


Sus scrofa 


forkhead-associated domain 
histidine-triad like protein


307 


93 


767 


gi26454883 


Homo sapiens 


hypothetical protein HSPC148 


1181 


100 


767 


gi6523797 


Homo sapiens 


i iv/ /j i durcnai giano 

nrotein AD-007 


110 1 

1 lol 


100


767 


gi6841518 


Homo sapiens 


AF161497_1 HSPC148


1178 


99 


768 


gi14009597


Homo sapiens


A P7 87 /a 1 Q 1 Ix/cwl nvi^rtna i ii^- 
Arzozoii/^i lysyi oxidase- li ice 

3 protein 


1816 


98 


768 


gi14486600


Homo sapiens


Ar j 1 1 d i j_ i lysyi oxidase-iike 
j proiein 


1816 


98 


768 


gi15186770


Homo sapiens


ArzoHoij^i lysyi oxiaase-nice 

■\rntpi n 


1816 


98 


769 


gi22713410 


Homo sapiens 


GYLTL1B protein 


3229 


100 


769 


gi3954938 


Homo sapiens 


acetylglucosaminyltransferase- 
like protein 


2292 


70 


769 


gi3954978 


Mus musculus 


acetylglucosaminyltransferase- 
like protein 


2292 


70 


770 


gi7209721 


Mus musculus 


DD57 


2243 


88 


770 


gi7209723 


Homo sapiens 


WD-repeat like sequence 


2476 


99 


770 


gi8217485


Homo sapiens 




2473 


99 


771 


gi16552001


Homo sapiens 


unnamed protein product 


3169 


100 
771


gi18676632


Homo sapiens 


FLJ00215 protein 


1943 


99 


771


gi21706685 


Mus musculus 


9630058J23Rik protein 


860 


59 


772 


gi10799166


Homo sapiens 


AF305686_1 protein kinase 
Njmu-Rl 


1915 


99 


772 


gi32425794 


Homo sapiens 


NJMU-R1 protein 


1888 


100 


772 


gi32450708 


Homo sapiens 


NJMU-R1 protein 


1888 


100 


773 


gi13277972


Mus musculus 


phosphatidate 
cytidylyltransferase 2 


2286 


96 


773 


gi19344052


Homo sapiens 


... 


2376 


100 


773 


gi4186023


Homo sapiens 


CDS2 protein 


2376 


100 


774 


gi17511840


Homo sapiens 


AAH18769 


2251 


99 


774 


gi20988879 


Homo sapiens 


Similar to hypothetical gene 
supported by AL133057; 
BC018769; BC009436; 
AL133057; AL133057; 
AL133057 


2251 


99 


774 


gi29387317 


Mus musculus 


120001 1022Rik protein 


1792 


79 


775 


gi13936996


Human herpesvirus 8 


ORF73 


219 


21 


775 


gi2246532 


Human herpesvirus 8 


ORF 73, contains large 
complex repeat CR 73 


226 


19 


775 


gi30526291 


Saimiriine 
herpesvirus 2 


latency associated nuclear 
antigen 


219 


31 


776 


gi13477379


Homo sapiens 


TTYH2 protein 


1037 


41 


776 


gi18676664


Homo sapiens 


FLJ00231 protein 


1796 


91 


776 


gi28422735 


Xenopus laevis 




1054 


40 


777


gi16877193


Homo sapiens 


AAH16860 G protein-coupled
receptor, family C, group 5, 
member C 


939 


98 


777


gi30583709 


Homo sapiens 


G protein-coupled receptor, 
family C, group 5, member C 


939 


98 


777


gi8118032 


Homo sapiens 


AF207989_1 orphan G-protein 
coupled receptor 


939 


98 


778 


gi15679980


Homo sapiens 


CI 14 protein 


930 


99 


778 


gi16769562


Drosophila 
melanogaster


LD38910p 


328 


47


778 


gi7302978 


Drosophila 
melanogaster


CG8441-PA 


328 


47 


779 


gi10726751


Drosophila 
melanogaster


CG13623-PA 


333 


53 


779 


gi21430012


Drosophila 
melanogaster


GH27470p 


333 


53 


779 


gi7406400 


Arabidopsis thaliana 


putative protein 


317 


45 


780 


gi13959018


Homo sapiens 


AF361746_1 endothelial cell- 
selective adhesion molecule 


902 


100 


780 


gi13991773


Mus musculus 


AF361882_1 endothelial cell- 
selective adhesion molecule 


640 


70 


780 


gi29165726


Mus musculus 


Endothelial cell-selective 
adhesion molecule 


640 


70 


781 


gi15422171


Homo sapiens 


22 kDa peroxisomal membrane 
protein 2 


1013 


100 


781 


gi297437 


Rattus norvegicus 


peroxisomal membrane protein 


795 


76 


781 


gi8164184


Homo sapiens 


22kDa peroxisomal membrane 
protein-like 


1013 


100 


782 


gi7620875 


Streptococcus 
pyogenes 


AF232324_1 Sic1.19


203 


41 
782


gi7620883 


Streptococcus 
pyogenes 


AF232328_1 Sic1.23


203 


39 


782 


gi7621271 


Streptococcus 
pyogenes 


AF232522_1 Sic1.217


203 


39 


783 


gi62877 


Gallus gallus 


type VI collagen alpha-2 
subunit preprotein 


734 


42 


783 


gi62881


Gallus gallus 


type VI collagen subunit 
alpha2 


734 


42 


783 


gi62882 


Gallus gallus 


type VI collagen subunit 
alpha2 


734 


42 


784 


gi17945608


Drosophila
melanogaster 


RE26969p 


829 


48 


784 


gi7292879 


Drosophila
melanogaster 


CG1998-PA 


829 


48 


784 


gi7292910 


Drosophila
melanogaster 


CG11162-PA 


597 


42 


785 


gi17066106


Homo sapiens 


Novex-3 Titin Isoform 


8832 


99 


785 


gi21238650


Calotomus carolinus 


titin-like protein 


519 


62 


785 


gi27696390 


Xenopus laevis 


Similar to titin 


816 


48 


786 


gi17979434


Arabidopsis thaliana 


putative adenylate kinase 


193 


22 


786 


gi22136756


Arabidopsis thaliana 


putative adenylate kinase 


193 


22 


786 


gi30180922


Nitrosomonas
europaea ATCC
19718


Adenylate kinase 


201 


27 


787




Macaca fascicularis 


hypothetical protein 


117 


OS 

yo 


788


gl loO/OOlvJ 


Homo sapiens 


FLJ00204 protein




7^ 


788


m7<1CQ77^ 


Mus musculus 


unnamed protein product 


lion 


7£ 


788 


gi3002588 


Mus musculus 


Plenty of SH3s; POSH 


197 


24 


789


glloO/OOlU 


Homo sapiens 


FLJ00204 protein




zo 


789 


gi26329287 


Mus musculus 


unnamed protein product 


1646 


75 


789


gi26389725 


Mus musculus 


unnamed protein product 


1646


75 


790 


gi12654107


Homo sapiens 


AAH00866 


531 


88 


790 


gi13937969


Homo sapiens 


TIMP1 protein


531 


88 


790 


gi189382


Homo sapiens 


collagenase inhibitor 


531 


88 


791 


gi24660226 


Homo sapiens 


C-type lectin-like receptor-1


1367 


90 


791 


gi7110216


Homo sapiens 


AF200949_1 C-type lectin-like
receptor-1


1365 


90 


791 


gi7110218


Mus musculus 


AF201457_1 C-type lectin-like 
receptor 2 


312 


29 


792 


gi10441350


Mus musculus 


olfactory UDP 
glucuronosyltransferase 


1557 


68 


792 


gi4753766 


Homo sapiens 


UDP glucuronosyltransferase 


1593 


67 


792 


gi5802604 


Cavia porcellus 


UDP glucuronosyltransferase 
UGT2A3 


1781 


72 


793 


gi13325266


Homo sapiens 


AAH04450 hypothetical 
proicin ivivj^zoDU 


888 


100 


793 


gi3688090 


Homo sapiens 


R32611_2


796 


91 


793 


gi6841228 


Homo sapiens 


AF161407_1 HSPC289 


645 


77 


794 


gi15488645


Mus musculus 


methyltransferase Cyt19


1552 


76 


794 


gi18150409


Rattus norvegicus 


AF393243_1 methyltransferase


1518 


76 


794 


gi9963861 


Homo sapiens 


AF226730_1 Cyt19


1729 


99 


795 


gi11877243


Homo sapiens 


SSF1/P2Y11 chimeric protein 


3802 


95 


795 


gi14602631


Homo sapiens 


Peter pan homolog 


2080 


99 


795 


gi21619996 


Homo sapiens 




2080 


99 
796


gi20330550 


Homo sapiens 


AF251706_1 NK inhibitory
receptor precursor 


799 


98 


796 


gi30962593 


Homo sapiens 


AF375481_1 immune receptor 
expressed on myeloid cells 
splice variant 2 


800 


99 


796 


gi31790204


Homo sapiens 


inhibitory receptor IREM1 


805 


99 


797 


gi20330550 


Homo sapiens 


AF251706_1 NK inhibitory 
receptor precursor 


799 


98 


797 


gi30962593 


Homo sapiens 


AF375481_1 immune receptor 
expressed on myeloid cells 
splice variant 2 


800 


99 


797 


gi31790204


Homo sapiens 


inhibitory receptor IREM1 


805 


99 


798 


gi20330550 


Homo sapiens 


AF251706_1 NK inhibitory 
receptor precursor 


1480 


94 


798 


gi30962591 


Homo sapiens 


AF375480_1 immune receptor
expressed on myeloid cells 
splice variant 1 


1401 


93 


798 


gi31790204


Homo sapiens 


inhibitory receptor IREM1 


1478 


94 


799 


gi18307481


Homo sapiens 


phosphoinositide-binding 
proteins 


2122 


100 


799 


gi27695704 


Mus musculus 


Connector enhancer of KSR2 


678 


36 


799 


gi29691916 


Rattus norvegicus 


interactor protein for cytohesin 
exchange factors 1 


1651 


79 


800


gi11493982


Homo sapiens 


AF208232_1 TLH29 protein 
precursor 


274 


72 


800 


gi15929988


Homo sapiens 


AAH15423 Similar to TLH29 
protein precursor 


424 


89 


800 


gi21618549


Homo sapiens 


TLH29 protein precursor 


274 


72 


801 


gi11493982


Homo sapiens 


AF208232_1 TLH29 protein
precursor 


303 


70 


801 


gi15929988


Homo sapiens 


AAH15423 Similar to TLH29 
protein precursor 


445 


100 


801 


gi21618549 


Homo sapiens 


TLH29 protein precursor 


303 


70 


802 


gi12082723


Gallus gallus


AF293805_1 B cell 
phosphoinositide 3-kinase 
adaptor 


2825 


69 


802 


gi12082725


Mus musculus 


AF293806_1 B cell 
phosphoinositide 3-kinase 
adaptor 


3557 


84 


802


gi12082811


Gallus gallus 


AF315784_1 B cell
phosphoinositide 3-kinase 
adaptor 


2330


73 


803




Homo sapiens 


Arlio/21 jj rKUlUoZ 


KA *T i 

04j 


1 C\(\ 

1UU 


804


gll.>.504o41 


Homo sapiens 


activating NK receptor 


10o4 


on 
yy 


804


gl L jJo4o43 


Homo sapiens 


NTB-A receptor


1 /UO 


100 


804


glyoo fvoy 


Mus musculus 


AF24eo35_l lymphocyte 

o nf i ctptx icrtfrtrm 1 
oilUgCll iUO loUiUHU 1 


£1C 


43 


805 


gi10177621


Arabidopsis thaliana 


phytoene dehydrogenase-like 


195 


75 


805 


gi17979255


Arabidopsis thaliana 


AT5g49550/K6M13_10


211 


72 


805 


gi29028742 


Arabidopsis thaliana 


At5g49550/K6M13_10 


211 


72 


806 


gi14270364


Mus musculus 


Epigen protein 


378 


71 


806 


gi6272269 


Rattus norvegicus 


NCI protein 


122 


52 


806 


gi7799191 


Mus musculus 


tomoregulin-1 


122 


52 


807 


gi 14270364 


Mus musculus 


Epigen protein 


378 


71 


807 


gi6272269 


Rattus norvegicus 


NCI protein 


122 


52 


807 


gi7799191 


Mus musculus 


tomoregulin-1 


122 


52 


808 


gi 14270364 


Mus musculus 


Epigen protein


378 


71 


808 


gi6272269 


Rattus norvegicus 


NCI protein 


122 


52 


808 


gi7799191 


Mus musculus 


tomoregulin-1 


122 


52 


809 


gi27469556 


Homo sapiens 


Putative neuronal cell adhesion 
molecule 


212 




809 


gi29289929 


Danio rerio 


neogenin 


185 


39 


809 


gi3068592 


Mus musculus 


punc 


198 


41 


810 


gi30348897 


Homo sapiens 


organic solute transporter beta 


643 


99 


810 


gi30348901 


Mus musculus 


organic solute transporter beta 


365 


62 


811 


gi 18650584 


Homo sapiens 


retinoic acid early transcript 1 


1070 


94 


811 


gi18650588


Homo sapiens 


retinoic acid early transcript 1 


1124 


99 


811 


gi21961213 


Homo sapiens 


UL16 binding protein 2 


1070 


94 


812 


gi13872813


Homo sapiens 


fibulin-6 


485 


30 


812 


gi14575679


Homo sapiens 


AF156100_1 hemicentin


485 


30 


812 


gi9280405 


Homo sapiens 


AF245505_1 adlican


1372 


46 


813 


gi13872813


Homo sapiens 


fibulin-6 


861 




813 


gi 14575679 


Homo sapiens 


AF156100_1 hemicentin


857 


29 


813 


gi9280405 


Homo sapiens 


AF245505_1 adlican




is 


814 


gi13872813


Homo sapiens 


fibulin-6 


OU 1 




814 


gi 14575679 


Homo sapiens 


AF156100_1 hemicentin


857 


29 


814 


gi9280405 


Homo sapiens 


AF245505_1 adlican


Z*t«DO 




815 


gi21619635 


Homo sapiens 


similar to Alu subfamily SQ 
sequence contamination
warning entry 


267 


60 


815 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c- 
NTP 




69


815 


gi6650810 


Homo sapiens 


AF118094_21 PRO1902


261 


61 


816 


gi 12240284 


Mus musculus 


AF327059_1 apolipoprotein
A5 


1300 


72 


816 


gi6707433 


Homo sapiens 


AF202889_1 apolipoprotein
A5 


1864 


100 


816 


gi6707435 


Homo sapiens 


AF202890_1 apolipoprotein
A5 


1864 


100 


817 


gi 12240284 


Mus musculus 


AF327059_1 apolipoprotein

A5 


i inn 


79 


817 


gi6707433 


Homo sapiens 


AF202889_1 apolipoprotein

A5 


1864 


100


817 


gi6707435 


Homo sapiens 


AF202890_1 apolipoprotein
A5 


1864 


100 


818 


gi13111784


Homo sapiens 


AAH03081 hypothetical 
protein FLJ10637 


1720 


99 


818 


gi 13543037 


Mus musculus 


4933424B01Rik protein


yoo 


fin 

oU 


818 


gi14249965


Homo sapiens 


AAH08368 hypothetical
protein FLJ10637 




inn 


819 


gi19344001


Homo sapiens 


phospholipase A2, group IID 


846 


99 


819 


gi5771420 


Homo sapiens 


AF112982_1 group IID
secretory phospholipase A2 


852 


100 


819 


gi6453793 


Homo sapiens 


AF188625_1 phospholipase
A2 


846 


99 


820 


gi21751722 


Homo sapiens 


unnamed protein product 


688 


84 


820 


gi26342939 


Mus musculus 


unnamed protein product 


496 


59 


821 


gi11094019


Homo sapiens 


AF305057_2 RTS beta


2116 


96 


821 


gi1150421


Homo sapiens 


rTSbeta 


2122 


96 
821


gi12654883


Homo sapiens 


AAH01285 rTS beta protein 


2122 


96 


822 


gi12803167


Homo sapiens 


AAH02387 nucleosome 
assembly protein 1-like 1 


1728 


99 


822 


gi189067


Homo sapiens 


NAP . 


1728 


99 


822 


gi30582885 


Homo sapiens 


nucleosome assembly protein 
1-like 2 


1728 


99 


823 


gi13432042


Homo sapiens 


integrin-linked kinase- 
associated serine/threonine 
phosphatase 2C 


2009 




823 


gi16306907


Homo sapiens 


AAH06576 integrin-linked 
kinase-associated 
serine/threonine phosphatase 
2C 


2009 


99 


823 


gi20072498 


Mus musculus 


0710007A14Rik protein 


1926 


94 


824 


gi28175169 


Mus musculus 


1300015B04Rik protein


835 


73 


824


gi28848867 


Homo sapiens 


URG11 


1164 


100 


824 


gi7768636 


Xenopus laevis 


Kielin 


239 


36 


825 


gi21928259


Homo sapiens 


seven transmembrane helix 
receptor 


1023 


100 


825 


gi21928496 


Homo sapiens 


seven transmembrane helix 
receptor 


1023 


100 


825 


gi21928655


Homo sapiens 


seven transmembrane helix 
receptor 


916 


89 


826 


gi 18480746 


Mus musculus 


olfactory receptor MOR261-10 


1278 


79 


826 


gi21928655


Homo sapiens 


seven transmembrane helix 
receptor 


1456 


93 


826 


gi32052225 


Mus musculus 


olfactory receptor
GA_x6K02T2P3E9-4341246-4340281


1278 


79 


827 


gi4760780 


Mus musculus 


Ten-m3 


364 


95 


827


gi5307761


Danio rerio 


ten-m3 


310 


78 


827


gi6760369


Mus musculus 


AF195418_1 ODZ3


364 


95 


828 


gi16265938


Homo sapiens 


AF314817_1 FKSG15 


2437 


98 


828 


gi21205852


Homo sapiens 


AF385429_1 T-cell activation 
Rho GTPase activating protein; 
TA-GAP 


3756 


100 


828 


gi21205854 


Homo sapiens 


AF385430_1 T-cell activation 
Rho GTPase activating protein 
splice variant 1; TA-GAP 


2850 


100 


829 


gi 10432396 


Homo sapiens 




383 


62 


829 


gi30908443 


Homo sapiens 


CUB and sushi multiple 
domains 2 


388 


63 


829 


gi30908445 


Homo sapiens 


CUB and sushi multiple 
domains 3 


549 


100 


830 


gi10432396


Homo sapiens 




383 


62


830 


gi30908443 


Homo sapiens 


CUB and sushi multiple 
domains 2 


388 


63 


830 


gi30908445 


Homo sapiens 


CUB and sushi multiple 
domains 3


549 


100 


831 


gi3342148 


Chlamydomonas 
reinhardtii 


myosin heavy chain 


499 


37 


831 


gi532124 


Dictyostelium 
discoideum 


myosin IC 


517 


41 


831 


gi8953751 


Arabidopsis thaliana 


myosin heavy chain MYA2 


492 


41 


832 


gi6472600 


Chara corallina 


unconventional myosin heavy
chain


621


38


832






JiiyUoiil liCdv y iti a JlVi* 


621 


38 


832


gl^T-J JO J J 


Chara corallina


myosin


621 


38 




glZ 1ZOJ IOJ 


Homo sapiens




2424 


99 


834 


gi7248845 


Homo sapiens 


AF231124_1 testican-1


2428 


99 


834


gl/ioo4D 


Homo sapiens 


testican 


9498 

ZtZO 




835 


gi20380774 


Homo sapiens 




2930 


99 


835 


gi22761091 


Homo sapiens 


unnamed protein product 


ZojU 


00 


835 


gi27502762 


Mus musculus 


hypothetical protein 
MUCZoyi 1 


2712 


90 


836 


gi20380774 


Homo sapiens 




2946 


100 


836 


gi22761091 


Homo sapiens 


unnamed protein product 


2366 


100 


836 


gi27502762 


Mus musculus 


hypothetical protein 
MuCzoVi 1 


2728 


91 


837 


gi17391348


Homo sapiens 


AAriioolD oimiiar to oram 
expressed, A-imKea i 


004 




on 


gi /ooyUzy 


— — : 

Homo sapiens 


Arzzuioy_i uncnaracierizeQ 
hypothalamus protein HBEX2 


QO*r 


inn 


837 


gi9963771 


Homo sapiens 


At i o j4 i o_ i ovarian granulosa
cell 13.0 kDa protein hGR74
homolog


AAA 


inn 


OJO 


gllDZl J 1ZZ 


__ 

Mus musculus 


chondroadherin


A98 

HZO 


D 1 






Mus musculus 


j4jU**-z /in 1 1 kjk protein 


din 


97 1 
z / 


838 


gi30908853 


Homo sapiens 


synleurin 


3201 


100 




gllZo4Z40D 


Mus musculus 


unnamed protein product 


jOf 


09 

yz 


839 


gi 15488920 


Homo sapiens 


AAH13587 Similar to RIKEN 
cDNA 2010107G23 gene


632 


100 


839 


gi 19354289 


Mus musculus 


RIKEN cDNA 2010107G23 
gene 


567 


92 


840 


gi 16549697 


Homo sapiens 


unnamed protein product 


2483 


99 


840 


gi20988071 


Mus musculus 


2600011E07Rik protein


919 


80 


840 


gi21619776


Homo sapiens 


similar to RIKEN cDNA
2600011E07 gene


1AQA 
245*1 


1 AA 


841 


gi12963869


Mus musculus 


gene trap ankyrin repeat 
containing protein 


ZZ3 




841 


gi28565117 


Drosophila 
melanogaster 


myosin phosphatase UMbb-b 


ZZ5 


zz 


841 


gi30138665


Nitrosomonas
europaea ATCC
19718


Ankyrin-repeat 


ZZo 


31 


842 


gi 12408272 


Homo sapiens 


apolipoprotein L-IV splice 
variant a 


1742 


100 


842 


gi 12408286 


Homo sapiens 


apolipoprotein L-IV splice 
variant a 


1742 


100 


842 


gi 13374351 


Homo sapiens 


AF305226_1 apolipoprotein
L4


1725 


99 


843


gi12408272


Homo sapiens 


apolipoprotein L-IV splice
variant a 


I ID 1 


QO 


843 


gi 12408286 


Homo sapiens 


apolipoprotein L-IV splice 
variant a 


1737 


99 


843 


gi 13374351 


Homo sapiens 


AF305226_1 apolipoprotein
L4 


1720 


99 


844 


gi21744725 


Homo sapiens 


AF478693_1 glycosyl-
phosphatidyl-inositol-MAM 


2296 


100 


844 


gi25005318 


Sus scrofa 


MAM domain containing
glycosylphosphatidylinositol
anchor 1


1804


93


844 


gi25005320 


Sus scrofa 


glycosylphosphatidylinositol
anchor 1 protein 




92 


845 


gi21744725


Homo sapiens 


AF478693_1 glycosyl-
phosphatidyl-inositol-MAM 




100


845 


gi25005318 


Sus scrofa 


MAM domain containing
glycosylphosphatidylinositol 
anchor 1 


4481 


95 


845 


gi25005320 


Sus scrofa 


glycosylphosphatidylinositol 
anchor 1 protein


Al so 




846 


gi1066493


Saccharomyces 
cerevisiae 


Yprl44cp 




10 


846 


gi32487557 


Oryza sativa 
(japonica cultivar- 
group) 


Uo J in tsauu i j is. l o. y 


565 


32 


846 


gl4UU/ /Do 


Schizosaccharomyces
pombe


^PRn604 06r 


613 


33 


847 


gi14280050


Homo sapiens 


Vps39/Vam6-like protein 


3913 


88 


847


gU4/Ui /Oo 


Homo sapiens


Vam6/Vps39-like protein


3990 


89 


847




Homo sapiens




4079 


98 


848




Homo sapiens




4095 


99 


848 


gi25059032 


Mus musculus 




3128 


72 


848 


gl294o /44z 


Homo sapiens 


delta 


1512 


41 


849 


gi 14603301 


Homo sapiens 


iiypotneticdi proLcm r l*j 11/ 


986 


100 


849 


gi7291437 


Drosophila 
melanogaster 


PfM.071-PA 


510 


49 


849 


giiJlOoMi 


Arabidopsis thaliana 


putative protein


340 


36 


850


gi13161409


Mus musculus 


family 4 cytochrome P450 


444 


73 


850 


gi13182964


Mus musculus 


l\rz.jjD*r3 i cyiocnromc r»tju 


196 


38 


850 


gi 13278244 


Mus musculus 


cytochrome P450, family 4, 

suuituniiy t, puiyucpiiuc u 


196 


38 


851 


gi 10944887 


Homo sapiens 


FGFR-like protein 


2475 


98 


851 


gi13183618


Homo sapiens 


AF312678_1 FGF homologous
factor receptor


7474. 


97 


851 


gi 13447749 


Homo sapiens 


AF279689_1 fibroblast growth
factor receptor 5


2475 


98 


852 


gi 10944887 


Homo sapiens 


FGFR-like protein 


2701 


99 


852 


gi13183618


Homo sapiens 


AF312678_1 FGF homologous
factor receptor 




98 


852 


gi 13447749 


Homo sapiens 


AF279689_1 fibroblast growth
factor receptor 5


77ft 1 


99 


853 


gi 10944887 


Homo sapiens 


FGFR-like protein


583


98 


853 


gi13183618


Homo sapiens 


AF312678_1 FGF homologous
factor receptor 


583 


98 


853 


gi 13447749 


Homo sapiens 


AF279689_1 fibroblast growth 
factor receptor 5 


583 


98 


854 


gi12667446


Rattus norvegicus 


AF336854_1 synaptotagmin
VIIs 


2034 


95 


854 


gi6136786 


Mus musculus 


synaptotagmin VII 


2025 


95 


854 


gi643656 


Rattus norvegicus 


synaptotagmin VII 


2034 


95 


855 


gi 12053709 


Homo sapiens 


with thrombospondin type 1 
motif; 12 


8842 


100 
855


gi27817773


Mus musculus 


metalloprotease disintegrin 19

protein 


7094 


80 


855 


gi5923788


Homo sapiens


AF140675_1 zinc
metalloprotease ADAMTS7 


2471 


51 


856 


gi15929988


Homo sapiens


AAH15423 Similar to TLH29
protein precursor 


179 


48 


857 


gi 13542874 


Mus musculus 


Similar to RIKEN cDNA 
2210412D01 


1301 


74 


857 


gi17391206


Mus musculus


RIKEN cDNA 2210412D01 


1591 


94 


857 


gi28277574 


Danio rerio 


Similar to RIKEN cDNA 
2210412D01 gene


1377 


79 


858 


gi13542874


Mus musculus 


Similar to RIKEN cDNA 
2210412D01 


1301 


72 


858 


gi17391206


Mus musculus 


RIKEN cDNA 2210412D01 


1591 


94 


858 


gi28277574 


Danio rerio 


Similar to RIKEN cDNA 
2210412D01 gene 


1343 


79 


859 


gi20071312 


Mus musculus 


4933425F03Rik protein 


1219 


80 


859 


gi217732


Oryctolagus
cuniculus


macrophage scavenger receptor 
type I subunit 


602 


38 


859 


gi33391740 


Homo sapiens 


MGC45780 


1521 


98 


860 


gi20071312 


Mus musculus 


4933425F03Rik protein 


1321 


86 


860 


gi33391740 


Homo sapiens 


MGC45780 


1656 


87 


860 


gi6478784 


Mus musculus 


scavenger receptor type A SR- 
A 


679 


34 


861 


gi11493463


Homo sapiens 


AF130117_38 PRO2852


298 


75 


861 


gi21748687


Homo sapiens 


unnamed protein product 


319 


72 


861 


gi28801453 


Homo sapiens 


unnamed protein product 


325 


77 


862 


gi14456629


Homo sapiens 




1232


50 


862 


gi15081398


Homo sapiens 


AF395541_1 kruppel-like zinc
finger protein 


1245


54 


862 


gi29476835 


Homo sapiens 




1222 


47 


863 


gi16551721


Homo sapiens 


unnamed protein product 


3124 


99 


863 


gi21320872


Mus musculus 


Coe8 


2744 


87 


863 


gi7297851


Drosophila 
melanogaster 


CG6488-PA 


1143 


43 


864 


gi 16307258 


Homo sapiens 


AAH09717 hypothetical 
protein 


942 


100 


864 


gi22945521 


Drosophila 
melanogaster 


CG31922-PA 


165 


33 


864 


gi7242597 


Homo sapiens 


hypothetical protein 


942 


100 


865 


gi23274241 


Homo sapiens 


KIAA1892-like 


2039 


86 


865 


gi26332H4 


Mus musculus 


unnamed protein product 


1964 


82 


865 


gi26345386 


Mus musculus 


unnamed protein product


1964 


82 


866 


gi15620885


Homo sapiens


KIAA1913 protein


2495 


100 


866 


gi26339494 


Mus musculus 


unnamed protein product 


2312 


90 


866 


gi28279830 


Homo sapiens 


KIAA1913 protein 


2495 


100 


867 


gi1000448


Rattus norvegicus 


Rat kidney AGT2 precursor 


2202 


81 


867 


gi12406973


Homo sapiens 


alanine-glyoxylate 
aminotransferase 2 


2740 


100 


867 


gi1944136


Rattus norvegicus 


beta-alanine-pyruvate 
aminotransferase 


2249 


83 


868 


gi1000448


Rattus norvegicus 


Rat kidney AGT2 precursor 


1583 


84 


868 


gi 12406973 


Homo sapiens 


alanine-glyoxylate 
aminotransferase 2 


1870 


98 
868


gi1944136


Rattus norvegicus 


beta-alanine-pyruvate 
aminotransferase 


1630 


86 


869 


gi26892205 


Homo sapiens 


1 


448 


39 


869 


gi29436673 


Mus musculus


1700049K14Rik protein 


1732 


99 


869 


gi4165315 


Sus scrofa 


kallikrein 


452 


41 


870 


gi17985046


Brucella melitensis 
16M 


GLYCOSYL TRANSFERASE


130 


28 


870 


gi20515259 


Thermoanaerobacter 
tengcongensis 


predicted glycosyltransferases 


133 


32 


870 


gi4455730 


Streptomyces 
coelicolor A3(2) 


putative transferase 


140 


32 


872 


gi13649477


Homo sapiens 


AF250309_1 putative cytokine 
receptor CRL4 precursor


1998 


100 


872 


gi30584223 


synthetic construct 


Homo sapiens interleukin 17B 
receptor 


1998 


100 


872 


gi8705222 


Homo sapiens 


AF212365_1 IL-17B receptor


1998 


100 


873 


gi18676472


Homo sapiens 


FLJ00133 protein 


6475 


100 


873 


gi20379832 


Homo sapiens 


FLJ00133 protein 


3072 


94 


873 


gi29568116 


Mus musculus 


secreted protein SST3


3973 


84 


875 


gi 14249936 


Homo sapiens 


AAH08349 Similar to S- 
adenosylhomocysteine 
hydrolase-like 1 


2581 


100 


875 


gi16588687


Homo sapiens 


AF315687_1 S-
adenosylhomocysteine 
hydrolase-like protein 


2429 


92 


875 


gi27692283 


Mus musculus 


S-adenosylhomocysteine 
hydrolase-like 1


2429 


92 


876 


gi14279990


Homo sapiens 


AF294842_1 ubiquitin UBF-fl 


458 


100 


876 


gi29791813 


Homo sapiens 


Ubiquitin-conjugating enzyme
E2C, isoform 1 


212 


74 


876 


gi30583439 


Homo sapiens 


ubiquitin-conjugating enzyme 
E2C 


212 


74 


877 


gi20086516 


Homo sapiens 


AF245303_1 prominin-2
variant A 


4241 


99 


877 


gi20086518 


Homo sapiens 


AF245304_1 prominin-2
variant B 


4241 


99 


877 


gi24637566 


Rattus norvegicus 


prominin-2 


3224 


74 


878 


gi29351676 


Homo sapiens 


Angiopoietin-like 5 


2104 


100


878 


gi29468510 


Homo sapiens 


putative fibrinogen-like protein 


2099 


99 


878 


gi29791750 


Homo sapiens 


angiopoietin-like 1 


392 


37 


879 


gi29351676 


Homo sapiens 


Angiopoietin-like 5 


2100 


99 


879 


gi29468510 


Homo sapiens 


putative fibrinogen-like protein 


2095 


99 


879 


gi29791750 


Homo sapiens 


angiopoietin-like 1 


392 


37 


880 


gi29351676 


Homo sapiens 


Angiopoietin-like 5 


2100 


99 


880 


gi29468510 


Homo sapiens 


putative fibrinogen-like protein 


2095 


99 


880 


gi29791750 


Homo sapiens 


angiopoietin-like 1 


392 


37 


881 


gi11493483


Homo sapiens 


AF130117J8 PRO2550 


319 


66 


881 


gi1872200


Homo sapiens 


alternatively spliced product 
using exon 13A 


303 


56 


881 


gi7770139 


Homo sapiens 


AF119917_13 PRO1722


318 


69 


882 


gi 13543706 


Homo sapiens 


AAH06003 


349 


100 


882 


gi20988061 


Mus musculus 


1810013D10Rik protein


333 


92 


882 


gi21619079 


Homo sapiens 




349 


100 


883 


gi11493652


Homo sapiens 


AF200708_1 calcium channel
blocker resistance protein


2552


100


883


gi13924720


Homo sapiens 


ArzDzo//_l cystine/glutamate 
transporter xCT 


2552 


100 


883 


gi 15082352 


Homo sapiens 


AAH12087 member 11 


2552 


100 


884 


gi 14252988 


Homo sapiens 


SRPKla protein kinase 


2297 


86 


884 


gi23468345 


Homo sapiens 


SFRS protein kinase 1


2304 


87 


884 


gi507213 


Homo sapiens 


serine kinase 


2297 


86 


885 


gi 18044358 


Homo sapiens 


AAH19883 Similar to lectin-
like NK cell receptor 


270 


57 


885 


gi9837288 


Homo sapiens 


C-type lectin 


270 


57 


885 


gi9837292 


Homo sapiens 


C-type lectin 


270 


57 


886 


gi22164066


Homo sapiens 


AF388385_1 neuroblastoma- 
amplified protein


7571 


99 


886 


gi30353863 


Homo sapiens 


NAG protein 


7227 


99 


886 


gi4337460 


Homo sapiens 


neuroblastoma-amplified 
protein 


6886 


99 


887 


gi22164066


Homo sapiens 


AF388385_1 neuroblastoma- 
amplified protein 


7309 


96 


887 


gi30353863 


Homo sapiens 


NAG protein 


6965 


96 


887 


gi4337460 


Homo sapiens 


neuroblastoma-amplified 
protein 


6624 


96 


888 


gi18645094


uncultured 
proteobacterium 


M20/M25/M40 family 
peptidase, putative 


383 


38 


888 


gi 19387947 


Mus musculus 


LOC212933 protein


510 


73 


888 


gi28806353 


Vibrio 

parahaemolyticus 


putative M20/M25/M40 family 
peptidase 


387 


35 


889


gl 1 155o029 


Homo sapiens 


organic cation transporter 


1 QC*7 

loot 


99 


889 


gi 18088251 


Homo sapiens 


AAH20565 Similar to hBOIT 
for potent brain type organic 
ion transporter 




95 


889


gl90DJl 17 


Homo sapiens 


organic cation transporter 




99


890 


gi21732438 


Homo sapiens 


hypothetical protein 


977 


100 


890 


gi26330392 


Mus musculus 


unnamed protein product 


765 


80 


890 


gi26390211


Mus musculus 


unnamed protein product 


765 


80 


891 


gi13375149


Homo sapiens 




853 


90 


891 


gi20072584 


Mus musculus 


cDNA sequence BC027127 


259 


37 


891 


gi7259265 


Mus musculus 


region 


277 


47 


892 


gi16589003


Homo sapiens 


AF386649_1 bromodomain- 
containing 4 


6353 


99 


892 


gi18308125


Mus musculus 


AF461395_1 bromodomain-
containing protein BRD4 long 
variant 


5992 


92 


892 


gi9931486 


Mus musculus 


AF273217_1 cell proliferation 
related protein CAP 


5994 


92 


893


gl 1 JHZUoZo 


Homo sapiens 


Arj7/J7^ 1 lNlJ-CO-1 




99


893 


gi19386926


Rattus norvegicus 


AF442822_1 optimedin form B 


2484 


98 


893 


gi19386930


Mus musculus 


AF442824_1 optimedin form B 


2484 


98 


894 


gi22209078 


Homo sapiens 


hypothetical protein 
DKFZp566D234 


4474 


99 


894 


gi26337809 


Mus musculus 


unnamed protein product 


4135 


91 


894 


gi6330966 


Homo sapiens 


KIAA1263 protein 


4492 


100 


895 


gi 12654031 


Homo sapiens 


AAH00819 Similar to CG6950
gene product 


1538 


99 
SEQ ID


Hit ID


Species


Description


S_score


Percentage
Identity


895 


gi5002565


Takifugu rubripes


cysteine conjugate beta-lyase 


1235 


55 


895 


gi758591 


Homo sapiens 


glutamine-phenylpyruvate 
aminotransferase


1193 


51 


896 


gi14017833




KIAA1808 protein


2905 


99 


896 


gi21666433 


Mus musculus 


AF404775_1 actin-binding
LIM protein 1 medium isoform


1498 


60 


896 


gi30259308 


Mus musculus 


actin-binding LIM protein 2 


2799 


86 


897 




Rattus norvegicus


protein serine/threonine kinase 


818 


52 


897 


gi6716518


Mus musculus


/vr 1 j j i aouDiecomn-iiKe 
kinase 


818 


52 


897 


gi6716522


Mus musculus


AF155821_1 CPG16


818 


52 


898 


gi2062399 


Rattus norvegicus


protein serine/threonine kinase 


818 


52 


898 


gi6716518 


Mus musculus 


AF1551 doublecortin-like
kinase 


818 


52 


898 




Mus musculus


AF155821_1 CPG16


818 


52 


899 


gi 13436035 


Mus musculus 


prostaglandin E synthase 2 


1583 


83 


899 


gi29179467


Danio rerio 


Similar to prostaglandin E 
synthase 2 


1079 


60 


899 


gi9280108 


Macaca fascicularis 


membrane-associated 
prostaglandin E synthase-2 


1907 


97 


900 




Mus musculus 


Complement component 1, q
subcomponent, alpha 
polypeptide 


945 


70 


900 


gi20988805


Homo sapiens


complement component 1, q
subcomponent, alpha 
polypeptide 


1308 


99 


900 


gi4894854


Homo sapiens


Ar complement C1q
A chain precursor


1308 


99 


901 


gi12841760


Mus musculus 


unnamed protein product 


928 


80 


901 


gi12846817


Mus musculus


unnamed protein product 


931 


80 


901 


£13080^090 


Homo sapiens


Similar to RIKEN cDNA
1810059G22 gene


1127 


100 


902


gi21707458


Homo sapiens


PAX transcription activation 
domain interacting protein 1 
like


2704 


87 


902 


gi2565046


Homo sapiens


CAGF28


3771 


97 


902 


gi4336734


Mus musculus


Pax transcription activation 
domain interacting protein PTIP


4115 


77 


903 


gi14164561




Ar 1 / Zojj 1 a Witt 


467 


79 


903 


gi4336734 


Mus musculus 


Pax transcription activation 
domain interacting protein PTIP 


531 


93 


904 


gi15929776


Homo sapiens


AAjiiDJuy growth suppressor 
1 


135 


41 


904 


gi23271416 


Mus musculus 


Leprel protein 


135 


41 


904 


gi30582917 


Homo sapiens 


1 




A\ 
*rl 


905 


gi2443352 


Mus musculus 


platelet glycoprotein Ib beta 1


149 


45 


905 


gi30908853 


Homo sapiens 


synleurin 


1549 


100 


905 


gi6808603 


Homo sapiens 


AF169675_1 leucine-rich
repeat transmembrane protein 
FLRT1 


147 


40 


906 


gi13991167


Homo sapiens 


sialic acid-binding 
immunoglobulin-like lectin-like 
long splice variant 


1174 


100 
906


gi14625822


Homo sapiens


AF282256_1 Siglec-L1


1174


100


906


gi23272769


Homo sapiens


SIGLEC-like 1


1174


100


907


gi13435476


Mus musculus


DNA segment, Chr 10,
University of California at Los
Angeles 1


900


yj


907 


gi28279553 


Danio rerio 


Similar to DNA segment, Chr 
10, University of California at 
Los Angeles 1 


750 


87 


907 


gi29144983 


Mus musculus 


DNA segment, Chr 6, ERATO 
Doi 253, expressed 


657 


67 


908 


gi 1504040 


Homo sapiens 




4470 


56 


908 


gi6273399 


Homo sapiens 


AF200348_1 melanoma-
associated antigen MG50 


4470 


56 


908


gi7292259


Drosophila
melanogaster


CG12002-PA


2536


36


909


gi1504040


Homo sapiens




4470


56


909


gi6273399


Homo sapiens


AF200348_1 melanoma-
associated antigen MG50


4470


56


909


gi7292259


Drosophila
melanogaster


CG12002-PA


2536


36


910


gi1504040


Homo sapiens




4112


56


910


gi6273399


Homo sapiens


AF200348_1 melanoma-
associated antigen MG50


4112


56


910


gi7292259


Drosophila
melanogaster


CG12002-PA


2388


36


911


gi18175295


Homo sapiens


CRB1 isoform 11 precursor


1258


28


911


gi18182323


Mus musculus


AF406641_1 crumbs-like
protein 1 precursor


1242


29


911


gi29144951


Mus musculus


5930402A21 protein


4084


72


912


gi11493463


Homo sapiens


AF130117_38 PRO2852


173


54


912 


gi21104464


Homo sapiens 


OK/SW-CL.41 


184 


61 


912


gi6650802 


Homo sapiens 


AF118094_17 PRO1848


200 


56 


913 


gi6808611


Homo sapiens 


AF204231_1 88-kDa Golgi
protein 


3237 


99 


913 


gi6969980 


Homo sapiens 


AF163441_1 golgin67


2345 


98 


913


gi7211438 


Homo sapiens 


AF164622_1 golgin-67


2327 


98 


914 


gi15030299


Mus musculus 


protein kinase, cAMP 
dependent regulatory, type I 
beta 


1881 


94 


914


gi200365 


Mus musculus 


cAMP-dependent protein 
kinase regulatory subunit 


1886 


94 


914


gi307377


Homo sapiens


cAMP-dependent protein
kinase RI-beta regulatory
subunit


1957


99


915


gi14017915


Homo sapiens


KIAA1849 protein


3460


100


915


gi7022002


Homo sapiens


unnamed protein product


3074


100


915 
916 
916 

916 


gi7022284 
gi1845577
gi30047223 

gi3645913 


Homo sapiens 
Mus musculus
Mus musculus 

Mus musculus 


unnamed protein product 
-lipoxygenase 

Arachidonate lipoxygenase, 
epidermal 


3460 
2619 
2617 


100 

77 

77 


917

917
917


gi15489302

gi1845577
gi30047223 


Mus musculus 

Mus musculus 
Mus musculus 


-lipoxygenase 
arachidonate lipoxygenase, 
epidermal 
-lipoxygenase 

Arachidonate lipoxygenase,
epidermal


2619
1142

1139
1142


77
69

69
69


918


cnl 5480109 


Mus musculus


arachidonate lipoxygenase, 
epidermal 


1263 


75 


918




Mus musculus


-lipoxygenase 


1260 


75 


918 


gi30047223 


Mus musculus 


Arachidonate lipoxygenase, 
epidermal 


1263 


75 


919 


gi12053299


Homo sapiens


hypothetical protein 


2183 


100 


919 


gi22478033


Homo sapiens 


hypothetical protein FLJ22944 


3409 


91 




1*19904^61 9 


Drosophila
melanogaster


CG31652-PA 


131 


23 


920 


gi14198207


Mus musculus 


hypothetical protein BC008163 


1599 


98 




m 1 Q1416Q9 






1625 


100 




m790AOA < ; 


Drosophila
melanogaster


CG4452-PA


615 


40 


921


m91 SQ4Q81 


Homo sapiens


cytokine-like protein C17


238 


74




gi8132683


Homo sapiens


AF193766_1 cytokine-like
protein C17 


238 


74 


922


tri91 504081 




cytokine-like protein C17 


238 


74 


922


gi8132683


Homo sapiens


AF193766_1 cytokine-like
protein C17 


238 


74 


923


oi9 1 504081 


Homo sapiens


cytokine-like protein C17


381 


81 


923


gi8132683


Homo sapiens


AF193766_1 cytokine-like
protein C17 


381 


81 


924


cri9 1 504081 


Homo sapiens


cytokine-like protein C17


263 


98 


924 


gi8132683


Homo sapiens 


AF193766_1 cytokine-like 
protein C17


263 


98 


925


ai91 504081 

1J7"70J 


Homo sapiens


cytokine-like protein C17


591 


100 


925


gi8132683


Homo sapiens


AF193766_1 cytokine-like
protein C17


591 


100 


926 


gi13396317


Homo sapiens 




2741 


99 


926


ail 7075777 




vesicular inhibitory amino acid 
transporter 


2741 


99 


926 


gi31566392 


Homo sapiens 


Vesicular inhibitory amino acid 
transporter 


2741 


99 


927


tri99 507470 


Mus musculus


AI413481 protein


2042 


92 


927 


gi3097285 


Rattus norvegicus 


ZOG 


658 


39 


927 


gi802014 


Rattus norvegicus 


preadipocyte factor 1 


653 


39 


928


gllO/Oo^ /4 


Drosophila
melanogaster


fVN/fft1989n 


357 


36 






Mus musculus


E030025D05Rik protein


1600 


89 


928 


gi6624073


Homo sapiens 


AC007743_1 similar to
hepatitis delta antigen
interacting protein A


1755 


93 


929 


gi14250638


Homo sapiens 


AAH08783 Similar to DNA 
segment, Chr 17, human 
D6S54E


864 


97 


929


gi3941733


Mus musculus 


AAC82476 BAT4 


582 


70 


929 


gi4337106 


Homo sapiens 


AAD18082 BAT4 


864 


97 


930 


gi27476065 


Oryza sativa 
(japonica cultivar- 
group) 


Putative 

phosphate/phosphoenolpyruvate 
translocator protein 


266 


30 


930 


gi5911433 


Rattus norvegicus 


AF182714_1 putative

phosphate/phosphoenolpyruvate 

translocator 


621 


88 


930 


gi9759107 


Arabidopsis thaliana


phosphate/phosphoenolpyruvate
translocator protein-like


282


30


931


oi 1 ^777RQ5 

gi iDZt /oyj 


Homo sapiens


AAH12939 Similar to 
cardiotrophin-like cytokine; 
neurotrophin-1/B-cell
stimulating factor-3 


1204 


99 


931 


gi16356643


Homo sapiens


cardiotrophin-like cytokine 


1204 


99 




gi6007643


Homo sapiens


neurotrophin-1/B-cell
stimulating factor-3 


1204


99 


932 


gi18490933


Homo sapiens 


FLJ21269 protein


846 


98 


932 


gi20268674


Mus musculus 


MT-MC1 


715 


82 


932 


gi22003732


Homo sapiens 


AF527367_1 MTLC


853 


99 


933 


gi15982236


Mus musculus 


putative methionyl 
aminopeptidase 


1095 


94 


933 


gi23306398 


Arabidopsis thaliana 


, putative 


744 


50 


933


ai 7480077 1 


Arabidopsis thaliana


, putative 


744 


50 


934


m]336013 


Mus musculus


neurexophilin 2 


550 


45 


934


gi22477181


Homo sapiens


Similar to neurexophilin 4 


1649 


99 


934 


gi4104963


Rattus norvegicus


neurexophilin 4 


1493 


90 


935 


gil2852913 


Mus musculus 


unnamed protein product 


193 


75 


935 


gi26326067 


Mus musculus 


unnamed protein product 


193 


75 


937


gi19387136


Homo sapiens


AF479748_1 PYRIN-
containing APAF1-like protein
5


874 


99 


937 


gi202806 


Rattus norvegicus 


vasopressin receptor 


561 


68 


937 


gi28436366 


Homo sapiens 


NALP6 


874 


99 


938


m 1 1 391395 


Homo sapiens


AF311862_1 Lin-7b


1030 


100 




<ri9ft381 103 


Homo sapiens


Lin-7b protein; likely ortholog
of mouse LIN-7B; mammalian
LIN-7 protein 2


1030 


100 


938


gi3885828


Rattus norvegicus


lin-7-A 


1019 


98 


939


pi 14349 125 


Homo sapiens


alpha2-glucosyltransferase


738 


96 


939


3? 49025 9 


Oryza sativa
(japonica cultivar-
group)


OSJNBb0116K07.l 


190 


36 


939 


gi351345l 


Rattus norvegicus 


potassium channel regulator 1 


718 


93 


940


gi13325140


Homo sapiens 


AAH04383 


2693 


100 


940


gi35768


Homo sapiens


polypirimidine tract binding 
protein 


2693 


100 


940


gi35774


Homo sapiens




2693 


100 


941 


gi21522774


Homo sapiens 


unnamed protein product 


3068 


100 




m94.fi47994 




Similar to EGF-like-domain,
multiple 6


3048 


99 




gi6752658


Homo sapiens


AF186084_1 epidermal growth
factor repeat containing protein


3043 


99 


942 


gi21522772


Homo sapiens 


unnamed protein product 


3102 


100 


942 


gi24047224 


Homo sapiens 


Similar to EGF-like-domain, 
multiple 6 


3043 


98 


942 


gi6752658 


Homo sapiens 


AF186084_1 epidermal growth 
factor repeat containing protein 


3038 


98 


943 


gi11385648


Homo sapiens 


AF273045_1 CTCL tumor
antigen se14-3


3867 


99 


943 


gi17980969


Homo sapiens 


AF454056_1 se14-3r protein


5146 


99 


943 


gi29165763


Mus musculus 


3632413B07Rik protein 


5213 


82 
SEQ_ID


HitID 


Species 


Description 


S_score


Percentage 
Identity 


944 


gil3677201 


Homo sapiens 




2771 


100


944 


gi17980969


Homo sapiens 


AF454056_1 se14-3r protein


3140 


99 


944 


gi29165763 


Mus musculus 


3632413B07Rik protein 


3613 


89 


945 


gi11385648


Homo sapiens 


AF273045_1 CTCL tumor
antigen se14-3


3806 


94 


945 


gi17980969


Homo sapiens 


AF454056_1 se14-3r protein


5085 


95 


945 


gi29165763


Mus musculus 


3632413B07Rik protein 


5492 


85 


946 


gi11385648


Homo sapiens 


AF273045_1 CTCL tumor
antigen se14-3


3806 


94 


946 


gi17980969


Homo sapiens


AF454056_1 se14-3r protein


5085 


95 


946 


gi29165763


Mus musculus


3632413B07Rik protein


5566 


87 


947 


gi14043211


Homo sapiens


AAH07594 Similar to RIKEN
cDNA 4931428F04 gene


2410 


98 


947 


gi21739633 


Homo sapiens 


hypothetical protein 


2430 


97 


947 


gi25058997


Mus musculus


1110003N12Rik protein


941 


63 


949 


gi19387136


Homo sapiens


AF479748_1 PYRIN-
containing APAF1-like protein
5


1735 


99 


949


gi202806 


Rattus norvegicus 


vasopressin receptor


1030 


64 


949 


gi28436366 


Homo sapiens 


NALP6 


1735 


99 


950 


gi20338417


Gallus gallus


potassium channel subunit


5079 


88 


950 


gi3875660


Caenorhabditis
elegans




2164 


45 


950 


gi3978472 


Rattus norvegicus 


potassium channel subunit 


5376 


90 


951 


gil8147612 


Homo sapiens 


metalloprotease disintegrin 


4376 


96 


951 


gi21908028 


Homo sapiens 


AF466287_1 a disintegrin and 
metalloprotease domain 33


4360 


96 


951 


gi21908030


Homo sapiens 


a disintegrin and 
metalloprotease domain 33 


4360 


96 


952 


gil2841733 


Mus musculus 


unnamed protein product 


715 


92 


952 


gil 8606367 


Mus musculus 


RIKEN cDNA 4930570C03 


715 


92 


952 


gi31581976


Homo sapiens 


FLJ20489 protein


472 


100 


953 


gi15420879


Mus musculus 


AF39897M ankyrin repeat- 
containing SOCS box protein 
10 


2049 


83 


953 


gil 803 1949 


Mus musculus 


SOCS box protein ASB-18 


800 


44 


953 


gil8092200 


Homo sapiens 


AF417920_1 ASB-10


2174 


91 


954 


gi32707 


Homo sapiens 


interferon-omega 1 


337 


51 


954 


gi386800 


Homo sapiens 


interferon-alpha 


340 


51 


954 


gi491284


synthetic construct


IFN-pseudo-omega 2


799 


98 


955 


gil5928971 


Homo sapiens 


AAH14951 Similar to neuronal
thread protein


430 


90 


955 


gi9844579


Homo sapiens




450 


97 


955 




Homo sapiens




623 


84 


956 


gill559412 


Homo sapiens 


NADPH-dependent retinol 
dehydrogenase/reductase 


587 


100 


956 


gil2804321 


Homo sapiens 


AAH03019 peroxisomal short- 
chain alcohol dehydrogenase 


685 


100 


956 


gil9113668 


Homo sapiens 


NADP-dependent retinol 
dehydrogenase short isoform 


878 


100 


957 


gi22658418 


Mus musculus 


cDNA sequence BC030934 


1499 


68 


957 


gi28838433 


Homo sapiens 


DKFZp762A2013 protein 


1759 


82 


957 


gi30842594 


Homo sapiens 


putative sulfhydryl oxidase 
precursor 


1668 


78 
958


gi12958660


Homo sapiens 


AF321918_1 acid phosphatase 


2252 


100 


958 


gil2958663 


Homo sapiens 


AF321918_4 acid phosphatase 
variant 3 


1285 


99 


958 


gi52871 


Mus musculus 


lysosomal acid phosphatase 


832 


45 


959 


gi11493443


Homo sapiens 


AF130117_27 PRO2209


1703 


100 


959 


gi28966 


Homo sapiens 


alpha 1-antitrypsin 


1703 


100 


959 


gi6855601 


Homo sapiens 


AF113676_1 PRO0684


1703 
1026


gil4150450 


Rattus norvegicus 


AF241241_1 UDP-
GalNAc:polypeptide N-
acetylgalactosaminyltransferase
T9


1350 


93 


1026 


gi25809274


Homo sapiens 


polypeptide N-
acetylgalactosaminyltransferase
10


1390 


97 


1026


tri9R9A8£7£ 
glZoZOoO /O 


Homo sapiens


UDP-N-acetyl-alpha-D-
galactosamine:polypeptide N-
acetylgalactosaminyltransferase
10


i JO^t 




1027


oil S9 17067 


Homo sapiens


AFA00416 1 stem cell factor
isoform 1


101Q 

1U 17 




1027 


gil827477 


Felis catus 


stem cell factor 


896 


84 


1027




Homo sapiens


stem cell factor


101Q 
11/12/ 




1028 


gil377895 


Homo sapiens 


OB-cadherin-2 


1572 


56 


1028


gi30171995


Homo sapiens


cadherin-24


9791 


93 


1028 


gi30171998 


Homo sapiens 


cadherin-24 variant 


2987 


99 


1029 


gil377895 


Homo sapiens 


OB-cadherin-2 


1621 


60 


1029 


gi30171995 


Homo sapiens 


cadherin-24 


2770 


99 


1029 


gi30171998 


Homo sapiens 


cadherin-24 variant 


2721 


93 


1030 


gil398903 


Mus musculus 


Ca2+ dependent activator 
protein for secretion 


6763 


94 


1030 


gi21541504 


Homo sapiens 


AF458662_1 calcium-
dependent activator protein for
secretion protein


6440


93






1030 


gi577428 


Rattus norvegicus 


Ca2+-dependent activator
protein; calcium-dependent
actin-binding protein


6449 


93 


1031 


gil 1071729 


Homo sapiens


putative dipeptidase


1847


99


1031 


gill 125344 


Homo sapiens


putative metallopeptidase


1^10 
10 Ly 


77 


1031 


gi32490515 


Mus musculus


putative membrane-bound
dipeptidase-3


LO 10 


71


1032 


gil 1493652 


Homo sapiens 


AF200708_1 calcium channel
blocker resistance protein 
CCBR1 


2559 


100


1032 


gil3924720 


Homo sapiens 


AF252872_1 cystine/glutamate 
transporter xCT 


2552 


100 


1032 


gil5082352 


Homo sapiens 


AAH12087 member 11 


2552 


100 


1033 


gi17028348


Homo sapiens 


DKFZP586G1517 protein


^748 

■J i to 


inn 

l viz 


1033 


gi20987924 


Mus musculus 


2410004L15Rik protein 


3473 


92 


1033 


gi29612455 


Mus musculus 


2410004L15Rik protein


^807 




1034 


gil9352987 


Homo sapiens 


Similar to KIAA0433 protein 


6348 


98 


1034 


gi2887437 


Homo sapiens


KlAA04n 




99


1034 


gi31418648


Mus musculus 




4981 


97 


1035 


gi11066463


Rattus norvegicus


AF995Q61 1 RfioOPR 

glutamate transport modulator 
GTRAP48 


OOOO 


oU 


1035 


gi19387126


Mus musculus 


AF467766_1 guanine
nucleotide exchange factor 


1778 


11 

oo 


1035 


gi7110160


Homo sapiens 


Guanine nucleotide exchange
factor 


17Q9 


18 

JO 


1036 


gil0726794 


Drosophila 
melanogaster 


CG5521-PA 




15 
oo 


1036 


gi24061707 


Mus musculus 


GAP-related interacting partner
to E12 


986 


97 


1036 


gi4240257 


Homo sapiens 


KIAA0884 protein


2491 


100 


1037 


gi20269957 


Sus scrofa 


AF498759_1 phospholipase C
delta 4 


1472 


85 


1037 


gi21307610 


Mus musculus 


phospholipase C delta 4


1327 


77 


1037 


gi571466 


Rattus norvegicus 


phospholipase C delta-4


1295 


76


1038 


gil6552885 


Homo sapiens 


unnamed protein product 


2084 


99 


1038 


gi26326051 


Mus musculus 


unnamed protein product


1085 


54 


1038 


gi26327387 


Mus musculus 


unnamed protein product 


1085 


54 


1039 


gil8480186 


Mus musculus 


olfactory receptor MOR261-6 


1323 


81


1039 


gi32052343 


Mus musculus 


olfactory receptor
GA_x6K02T2P3E9-4384160- 
4383228 




81 


1039 


gi9368991 


Homo sapiens 




14m 


inn 


1040 


gi29791964 


Homo sapiens 


Thrombospondin 4


4708 
*t / yo 


99


1040 


gi311626


Homo sapiens 


thrombospondin-4 


4787 


99 


1040 


gi3860231 


Mus musculus 


thrombospondin-4 


4557 


93 


1041 


gil4043083 


Homo sapiens 


AAH07524 sperm associated 
antigen 9 


660 


100 


1041 


gi24460121 


Homo sapiens 


AF327452_1 JNK-associated
leucine-zipper protein 


273 


98 


1041 


gi29169179 


Homo sapiens 


PHET 


343 


98 


1042 


gi21654741 


Homo sapiens 


peptide/histidine transporter 


2771 


95 


1042 


gi2208839 


Rattus norvegicus


peptide/histidine transporter


2344 


82 


1042 


gi33126130 


Homo sapiens 


peptide/histidine transporter 


2736 


94 


1043 


gi22831474 


Drosophila 
melanogaster 


CG14622-PC 


ZDUo 


47


1043 


gi22831475 


Drosophila 
melanogaster 


OO140zZ-rD 


7^ns 


47 


1043


gi29477075 


Mus musculus 


Similar to dishevelled
associated activator of
morphogenesis 1


Z.JmCl 


Ql 


1044 


gi15929979


Homo sapiens 


AAH15418 Similar to zinc 

fin per nrotein 14 S 


2476 


100 


1044


m* 1^/11 77 A1 


Mus musculus


R210112T1 8Rik nrotein 


1788 


57 


1044


gljOoU iJCt 


Homo sapiens


AC007842 1 BC11U91 1 

AV^UU / 0*TZ. J UwJ 11-7 1. 1 


1922 


52 


1045 


gil2655913 


Homo sapiens 


AF227516_1 sprouty-4A


386


98 


1045


gll/055yiJ 


Homo sapiens 


A 17777^1 7 1 crvrruitv-4P 


JOU 


98 


1045


glZ9 /4/yuo 


Mus musculus 


oprouiy noinoiog t 


■jA^Kj | 


81 


1046


gl29oy24yo 


Mus musculus 


iNAA^j-pepuaase u 




88 
oo 


1046


gi3211746


Sus scrofa 


folylpoly-gamma-glutamate
carboxypeptidase


9R1Q 

ZO 17 


70 


1046 


gl45iy5Zj 


— — — 

Homo sapiens 


NAALADase II protein


JOO 1 


100


1047


gizi /juuuy 


Homo sapiens 


unnamed protein product


1414


99 


1047


rrJOQ^I 7742 

glZ3DlZZ4o 


Homo sapiens


Similar tr\ HT^PO Tntprarrincr 
Protein 2 


676 


53 


1047


m'7644Q76Q 




hypothetical protein


1421 


99 


1048


oi<?Q18167 


Homo sapiens


plexin-B1/SEP receptor


3578 


42 


1048 


gi6651051 


Mus musculus 


AF133093 2 plexin6 


3147 


40 


1048


mQ8R^75Q 


Homo sapiens


AF149019_1 plexin-B3


3140 


40 




gll3UO 1 jyi. 


Homo sapiens


AF195817_1 NAC1 protein


1268 


55 


1049 


gi30931339 


Mus musculus 


Nacl-pending protein 


1254 


57 


1049


gijj^yz/Di 


Homo sapiens 


protein


1 A*\J(J 


55 


1050


m 1 1 AQ7S0.7 

gi i loyzouz 


Homo sapiens


AT717fY994 1 ARPfrS 


3123 


99 


1050 


gil5088540 


Homo sapiens 


AF324494_1 sterolin-2


3127 


99 


1050


gllM4o444 


Homo sapiens 


A 5 1 R9 A 1 cfprAl in-9 


11 17 


99 


1051 


gil2652851 


Homo sapiens 


AAH00178 potassium channel
modulatory factor


mi 


100 


1051


glZo4DJjJ0 



Homo sapiens 


T-rinr 1 ! 
ri\Jv>i 


1981 

1 70J 


99 


1051


gl/O//05o 


Homo sapiens 


ArUjOjZ^i poiassiurn 
channel modulatory factor 


1Q81 

170J 


99 

77 


1052


gijjiys 


Homo sapiens 




701 


70 


1052 



gi33730 


Homo sapiens 


immunoglobulin lambda light
chain 


716 


71 


1052


gi33734 


Homo sapiens 


immunoglobulin lambda light
chain 


716 


71


1053 


gi21388773 


Homo sapiens 


kringle-containing protein


1764 


80 


1053 



gi21388775 


Homo sapiens 


kringle-containing protein 




78


1053 


gi21623530


Homo sapiens 


kringle-containing
transmembrane protein


IHJO 


00 


1054 


gil4495324 


Homo sapiens 


CMRF35A 


432 


48 


1054 


gil8490143 


Homo sapiens 


CMRF35 leukocyte 
immunoglobulin-like receptor 


432 


48 


1054 


gi396170 


Homo sapiens 


CMRF-35 antigen 


432 


48 


1055 


gi4468255 


Homo sapiens 


MHC class I antigen 


1925 


98 


1055 


gi4468256 


Homo sapiens 


MHC class I antigen 


1974 


100 


1055 


gi487909 


Homo sapiens 


HLA-A11 antigen A11.1


1914 


97 


1056 


gi21667214 


Homo sapiens 


AF465767_1
bactericidal/permeability-
increasing protein-like 3


741


100






1056 




Homo sapiens 


RY2G5


1 71 




1056 


gi57732 


Rattus rattus 


potential ligand-binding 
protein 


210 


35 


1057 


gi21667214 


Homo sapiens 


AF465767_1
bactericidal/permeability-
increasing protein-like 3


LLLd 


99


1057 


gi32490539 


Homo sapiens 


RY2G5


^74 
JZH 


J L 


1057 


gi57732 


Rattus rattus 


potential ligand-binding 
protein 


564 


32 


1058 


gi21667214


Homo sapiens 


AF465767_1
bactericidal/permeability-
increasing protein-like 3


\y 10 


99


1058 


gi32490539


Homo sapiens 


RY2G5




0 L 


1058 


gi57732 


Rattus rattus


potential ligand-binding
protein 


471 


11 


1059


gi21667214


Homo sapiens


AF465767_1
bactericidal/permeability-
increasing protein-like 3


1 84? 




1059 


gi32490539 


Homo sapiens 


RY2G5 


434 


31 


1059 


gi57732 


Rattus rattus 


potential ligand-binding 
protein 


473 


33 


1060


m 11<»701 

gi lojzy oo 


Homo sapiens


A AH0S14G 


1128 


99 


1060


glDZyj 14 


Sus scrofa


neuronal cnuucrinc piuicui 


1092 


95 


1060


r»!'7'7 1 QA7Q 


Homo sapiens 


neuroendocrine protein 7B2


1 148 


100 


1061


rri 1 ^Q7QA1A 

gl I jy Zy\Jj\J 


Homo sapiens


a AMI 4Q71 


2325 


100 


1061 


gil6551493 


Homo sapiens 


unnamed protein product 


2321 


99 


1061


nil 8£Q8£A1 


Homo sapiens 


AF4A7441 1 Smith-Magenis
syndrome chromosome region
candidate 7 protein


2^25 


100 


1062


gil jOHOUol 


Mus musculus


ClauUHl U 


822 


70 


1062 


gi4128041 


Homo sapiens 


claudin-9 protein 


1116 


100 


1062


gi4jZj/yo 


Mus musculus 


ciauain-7 


1078 


95


1063 


gi1215742


Homo sapiens 


HIP 


434 


65 


1063 


gil42oo25o 


Homo sapiens 


AAriuoyzo noosomai protein 

r 9Q 

L*z.y 






1063


*«7O10vl1 



Homo sapiens 


ribosomal protein L29


4^4 


65 


1064


gi4jo/cyo 


Rattus norvegicus


A £077^00 1 oliitamatp 
i\r u / z«? v2*__ i gi uuimaic 

JLCWCpiUI lllldav/llUg piuiwui a. 


jj*ty 


86 


1064


mA711 7fi7 
gl4/J IZ5/ 


Rattus norvegicus


glutamate receptor interacting
protein 2


3281 


81 


1064 


gi6601555 


Rattus norvegicus 


glutamate receptor interacting 
protein 2


3549 


86 


1065


glZJ4yo44Z 


Rattus norvegicus


disabled-1


2807 


96 


1065 


gi3288852


Homo sapiens


disabled- 1 


2865 


99 


1065 


gi8118615 


Homo sapiens 


AF263547 1 disabled-1 


2842 


99 


1066 


gil6877456 


Homo sapiens 


AAH16974 


1711 


100 


1066 


gi20810324 


Homo sapiens 




1410 


86 


1066 


gi26351033 


Mus musculus 


unnamed protein product 


1236 


76 


1067 


gil5430703 


Homo sapiens 


AF362953_1 testis specific 
serine/threonine kinase 2 


1858 


99 


1067 


gi2738898 


Mus musculus 


protein kinase 


1683 


89 


1067 


gi33590489 


Rattus norvegicus 


serine/threonine kinase 22B 


1754 


92 


1068 


gil2963879 


Homo sapiens 


prostaglandin D synthase 


980 


96 


1068 


oil ^4^68 


Homo sapiens


r 1 vjruo proiein 




yo 


1068 


oi18Q779 
giio? / /z 


Homo sapiens


prostaglandin D2 synthase


osn 


yo 


1069


oil d/^671 8 
gilt J JO / 10 


Homo sapiens


/vDuuo z to £ t_ 1 0 similar to 
n/wjn 


1 1 ^7 

i ij i 


1 AA 

1UU 


1069 


oi90Q8888S 


Mus musculus


98 1 001 4T9TRiV nrnt^in 


i i^i 

L 13 J 


7Q 

fy 


1069 


gi2459803 


Rattus norvegicus 


RSP29 


645 


48 


1070


giijjy /ojj 


Homo sapiens 


annexin A13 isoform b 


L J9j 


99 


1070 


gi21218387 


Oryctolagus
cuniculus


AF510726_1 annexin XIIIb


1589 


88 


1070 


gi757784 


Canis familiaris 


annexin XIIIb


1621


89 


1071


rn'9Ail999 

glZU4ZZZ 


Rattus norvegicus 


GABA transporter protein


"5 Art A 

3094 


96 


1071


mil 7A7OA0 

gi/i /u/yuo 


Homo sapiens 


, member 1 


3 1Z6 


98 


1071


glilOJO 


Homo sapiens 


GABA transporter


3111


98 


1072


gll4lOJ I /O 


Rattus norvegicus 


AF378093_1 sodium channel
beta 3 subunit


823 


98 


1072


gi / iouy /d 



Homo sapiens 


voltage-gated sodium channel 
beta-3 subunit




100




cri716188Q 
gi / 101007 


Rattus norvegicus


voltage-gated sodium channel
beta-3 subunit


89^ 


yo 


1073


0190^81966 


Homo sapiens


Glypican 2




100


1073 


gi440127 


Rattus norvegicus 


cerebroglycan 


2506 


82 


1073


oiSOl 1190 

glJ7 I IjZu 


Mus musculus


Ariujzoo i giypican-o 


I 104 


/I /I 

44 


1074 


gil8676470 


Homo sapiens 


FLJ00132 protein


2515 


99 


1074


giiy,544uoo 


Mus musculus


2700038E08Rik protein


J4U / 


77


1074 


gi23274106 


Mus musculus


2700038E08Rik protein 


3407 


77 


1075


gl2;>39o3o7 


Homo sapiens 


alpha 2,6-sialyltransferase 


2844 


100 


1075 


gi27650880 


Homo sapiens 


beta-galactoside alpha-2,6- 
sialyltransferase 


1183 


100 


1075 


gi45275 1 


Gallus gallus


Gal beta 1,4 GlcNAc alpha 2,6- 
sialyltransferase 


943 


54 


1076 


gi13344995


Homo sapiens 


Cat Eye Syndrome critical 
region protein isoform 1 


2002 


99 


1076


gi13344997


Homo sapiens 


Cat Eye Syndrome critical 
region protein isoform 2 


2223 


100 


1076 


gi27503696 


Homo sapiens 


Similar to cat eye syndrome 
chromosome region, candidate 
5 


2223 


100 


1077 



gi13344995


Homo sapiens 


Cat Eye Syndrome critical 
region protein isoform 1


1662 


96 


1077 


gi13344997


Homo sapiens 


Cat Eye Syndrome critical 
region protein isoform 2 


1662 


96 


1077


gi27503696



Homo sapiens 


Similar to cat eye syndrome 
chromosome region, candidate 

5


1662 


96 


1078 


gi177870


Homo sapiens 


alpha-2-macroglobulin
precursor 


2718 




1078 


gi25303946 


Homo sapiens 


alpha-2-macroglobulin 


2718 


39 


1078 


gi579592 


Homo sapiens 


alpha 2-macroglobulin 690-730 


2712 


39 


1079 


gi25303946 


Homo sapiens 


alpha-2-macroglobulin 


1290 


35 


1079 


gi579592 


Homo sapiens 


alpha 2-macroglobulin 690-730 


1290 


35 


1079 


gi579594 


Homo sapiens 


alpha 2-macroglobulin 690-740 


1291 


36 


1080 


gi25303946 


Homo sapiens 


alpha-2-macroglobulin 


761 


31 


1080 


gi671864 


Gallus gallus 


ovomacroglobulin, ovostatin 


792 


32 


1080


cri671RfiS 
510 / IOOJ 


Gallus gallus


ovomacroglobulin, ovostatin


792 


32 


1081


ml 77R7fi 

gl 1 / tO /U 


Homo sapiens


alpha-2-macroglobulin

precursor 


2736 


39 


1081 


gi25303946 


Homo sapiens 


alpha-2-macroglobulin 


2736 


39 


1081


gi579592


Homo sapiens


alpha 2-macroglobulin 690-730


2730 


39 


1082 


gi25303946 


Homo sapiens 


alpha-2-macroglobulin 


1290 


35 


1082


gi579592


Homo sapiens


alpha 2-macroglobulin 690-730


1290 


35 


1082


gi579594


Homo sapiens


alpha 2-macroglobulin 690-740


1291 


36 


1083


gll /D l/jOL 


Mus musculus


i»cff»raQf» 


2029 


66 


1083 


gi29476863 


Mus musculus 


Similar to esterase 31


2022 


66 


1083 


gi404389


Mus sp. 


carboxylesterase; Es-male 


2001 


66 


1084 


gi207286 


Rattus norvegicus 


TGF-beta masking protein
large subunit


8721 


89 


1084 


gi26006334



Mus musculus 


latent transforming growth
factor beta binding protein 1L


8630 


88 


1084


gij^yj 1/0 


Mus musculus


latent TGF beta binding protein


8627 


88 


1085 


gil7985371 


Homo sapiens 


13 binding protein 


861 


100 


1085


gUo400oUo 


Homo sapiens


AF9S^671 1 cervical cancer 1 
nrnt n-nnPA^ene-bindinff orotein 

KG19 


853 


99 


1085


crJOl 0^1770 

giziyotzzy 


Homo sapiens


RRTC hindinp nrotein 

iJlVlJ UlllVlll&g pi VW1U 


861 


100 


1086 


glZZ/OJJ 


Gallus gallus


M-protein


2924 


42 


1086 




Mus musculus


1V1 pi u icu 1 


2908 


42 


1086 


gi407097 


Homo sapiens 


165kD protein 


2912 


42 


1087 


gllZOJJ lOD 


Homo sapiens


AAHft14^8 71 nr finder nrotein 
256 


693 


65 


1087


m1fKfc7^4S 
gljUJOZjHJ 


Homo sapiens


zinc finger protein 256


693 


65 


1087




Homo sapiens


AF067165_1 zinc finger
protein 3 


693 


65 






Homo sapiens


zinc finger protein z£p6 


311 


49 


1088




Homo sapiens


zinc finger protein 256 


309 


56 


1088


<xi4RQ4'*64 


Homo sapiens


AF067165_1 zinc finger
protein 3 


309 


56 


1089


gi12655452


Homo sapiens


keratin associated protein 4.7 


981 


76 


1089


gi12655460


Homo sapiens


keratin associated protein 4.12 


970 


77 


1089 


gi12655464


Homo sapiens 


keratin associated protein 4.15 


973 


81 


1090


cri 1765^446 


Homo sapiens


keratin associated protein 4.4 


400 


69 


1090


gi12655452


Homo sapiens


keratin associated protein 4.7 


383 


81 




gi12655460


Homo sapiens


keratin associated protein 4.12 


400 


61 


1091 


gi12655452


Homo sapiens 


keratin associated protein 4.7 


1219 


90 


1091


gi12655460


Homo sapiens


keratin associated protein 4.12


1158 


88 


1091


gi12655464


Homo sapiens


keratin associated protein 4.15


1260 


100 


1092 


gi15722084


Homo sapiens 




1991 


100 


1092 


gi434306


Homo sapiens 


lysosomal acid lipase; sterol
esterase


1289 


63 


1092


gi506431


Homo sapiens 


lysosomal acid lipase 


1289 


63 


1093 


gil5722084 


Homo sapiens 




1935 


100 


1093 


gi434306 


Homo sapiens 


lysosomal acid lipase; sterol 
esterase 


1289 


63 


1093 


gi506431 


Homo sapiens 


lysosomal acid lipase 


1289 


63 


1094 


gi20152322 


Homo sapiens 


putative G-protein coupled 
receptor 


1558 


99 


1094 


gi32526601 


Homo sapiens 


GPRC5D 


1558 


99 


1094 


gi81 18040 


Homo sapiens 


AF209923_1 orphan G-protein
coupled receptor


1804


99






1095 


gi15099951


Mus musculus 


AF384160_1 diacylglycerol 
acyltransferase 2 


596 


49 


1095 


gi18129609


Homo sapiens 


AF384161_1 diacylglycerol
acyltransferase 2 


597 


49 


1095 


gi27693972 


Mus musculus 


diacylglycerol O-
acyltransferase 2 


596 


49 


1096 


gil7224598 


Homo sapiens 


AF293615_1 blood dendritic
cell antigen 2 protein 


1134 


95 


1096 


gi17225337


Homo sapiens 


AF325459_1 dendritic lectin


1134 


95 


1096 


gil7225339 


Homo sapiens 


AF325460_1 dendritic lectin b 
isoform 


930 


80 


1097 


gil7224598 


Homo sapiens 


AF293615_1 blood dendritic
cell antigen 2 protein 


1182 


99 


1097 


gi17225337


Homo sapiens 


AF325459_1 dendritic lectin


1182 


99 


1097 


gil7225339 


Homo sapiens 


AF325460_1 dendritic lectin b
isoform 


978 


84 


1098 


gil8479834 


Mus musculus 


olfactory receptor MOR144-1 


1220 


77 


1098 


gi21929119 


Homo sapiens 


seven transmembrane helix 
receptor 


1595 


100 


1098 


gi32063297 


Mus musculus 


olfactory receptor 
GA_x6K02T2PVTD- 
14025733-14026668 


1220 


77 


1099 


gi19526645


Homo sapiens 


AF430017_1 intestinal
membrane mucin MUC17 


775 


33 


1099 


gi591H69 


Homo sapiens 


AF147790_1 transmembrane 
mucin 12 


3049 


99 


1099 


gi5911171 


Homo sapiens 


AF147791_1 mucin 11


671 


54 


1100 


gi219497


Homo sapiens 


biliary glycoprotein 


446 


34 


1100 


gi3172151 


Homo sapiens 


BGPg_HUMAN


446 


34 


1100 


gi37198 


Homo sapiens 


TM1-CEA preprotein 


446 


34 


1101 


gi1504040


Homo sapiens 




4709 


60 


1101 


gi6273399 


Homo sapiens 


AF200348_1 melanoma- 
associated antigen MG50 


4709 


60 


1101 


gi7292259 


Drosophila 
melanogaster 


CG12002-PA 


2660 


38 


1102 


gil504040 


Homo sapiens 




4596 


59 


1102 


gi6273399 


Homo sapiens 


AF200348_1 melanoma- 
associated antigen MG50 


4596 


59 


1102 


gi7292259 


Drosophila 
melanogaster 


CG12002-PA 


2606 


38 


1103 


gil0435776 


Homo sapiens 


unnamed protein product 


4413 


99 


1103 


gill611734 


Homo sapiens 


AF245388_1 GREB1a


510 


46 


1103 


gi7264653 


Mus musculus 


AF180470_1 Kiaa0575


3121 


53 


1104 


gi16519041


Drosophila 
melanogaster 


AF427496_1 occludin-like
protein 


184 


23 


1104


gi20219008


Chlamydomonas 
reinhardtii 


AF394181_1 coiled-coil
flagellar protein 


673 


36 


1104


gi7301551


Drosophila 
melanogaster 


CG6059-PA 


169 


19 


1105


gi12654511


Homo sapiens 


Torsin family 3, member A 


693 


96 


1105


gi14043167


Homo sapiens 


Torsin family 3, member A 


693 


96 


1105


gil5079904 


Homo sapiens 


Torsin family 3, member A 


693 


96 


1106 


gi21666374


Mus musculus 


swan 


325 


72 


1106


gi21666376 


Mus musculus 


swan 


325 


72 
1106


gi29747798


Mus musculus 


3000004N20Rik protein 


704 


86 


1107


gi15076843


Homo sapiens 


AF233450_1 pecanex-like 
protein 1 


2759 


68 


1107 


gil8157547 


Mus musculus 


AF237953_1 pecanex-like 3


4201 


93 


1107 


gi6650377 


Mus musculus 


AF096286_1 pecanex 1


2767 


67 


1108 


gi15076843


Homo sapiens 


AF233450_1 pecanex-like 
protein 1 


2402 


73 


1108 


gi18157547


Mus musculus 


AF237953_1 pecanex-like 3


3138 


97 


1108


gi6650377


Mus musculus 


AF096286_1 pecanex 1


2406 


73 


1109


gi21595759


Homo sapiens 


similar to HC6 


211 


71 


1109


gi7020440


Homo sapiens 


unnamed protein product 


215 


57 


1109 


gi7770237 


Homo sapiens


AF119917_62 PRO2822


232 


61 


1110


gi26333913


Mus musculus 


unnamed protein product 


749 


83 


1110 


gi26343633


Mus musculus 


unnamed protein product 


749 


83 


1110 


gi27370621


Homo sapiens 


Similar to hypothetical protein 
FLJ31737 


828 


95 


1111 


gi12043567


Homo sapiens 


unc-93 related protein 


1571 


99 


1111 


gi17390915


Mus musculus 


unc93 homolog B 


1367 


87 


1111


gi23271746 


Mus musculus 


Unc93b protein 


1367 


87 


1112 


gil5990461 


Homo sapiens


AAH15612 ring finger protein 
25 


2465 


100 


1112 


gil8490513 


Mus musculus 


Rnf25 protein 


1983 


82 


1112 


gi29179411


Mus musculus 


Ring finger protein 25 


1988 


82 


1113


gi19716048


Xenopus laevis 


Wee1B kinase


1123 


45 


1113 


gi2827996 


Xenopus laevis 


wee1 homolog


1291 


51 


1113 


gi644770 


Xenopus laevis 


Wee1A kinase


1296 


51 


1115 


oil 50301 19 


Mus musculus


3110057O12Rik protein


777 


97 


1115


gi23093574


Drosophila
melanogaster 


CG32112-PA 


366 


42 


1115


gi23093575


Drosophila
melanogaster 


CG32112-PB 


397 


47 


1116


oil 1403409 


Homo sapiens


AF130117_10 PRO0898


129 


59 


1116


oi? 1708099 


Homo sapiens


similar to Alu subfamily SQ
sequence contamination 
warning entry 


135 


70 


1116 


gi28800991


Homo sapiens 


unnamed protein product 


124 


67 


1117


oil 3R 10898 

gll JO 1U070 


Rattus norvegicus


AF322216_1 inhibin binding 
protein long isoform


515 


32 


1117


gi2370143


Homo sapiens


immunoglobulin-like domain- 
containing 1


503 


32 


1117


gi2645890


Homo sapiens


IGSF1 


503 


32 


1118 


gi2370143 


Homo sapiens 


immunoglobulin-like domain- 
containing 1 


307 


38 


1118


oi3?330fi85 


Mus musculus


inhibin binding protein/p120
long isoform 


310 


38 


1118 


gi32330691 


Mus musculus 


inhibin binding protein/p120
variant 4 


310 


38 


1119 


gi21595190 


Mus musculus 


2510001A17Rik protein


4878 


95 


1119 


gi21707128 


Homo sapiens 


Ran binding protein 11


5047 


99 


1119 


gi66506l2 


Homo sapiens 


AF111109_1 Ran binding
protein 11


5047 


99 


1120 


gil399805 


Homo sapiens 


Bbp/53BP2 


2078 


46 


1120 


gil6197705 


Homo sapiens 


ASPP2 protein 


2439 


47 


1120 


gi 18652832 


Homo sapiens 


ASPP1 protein 


5703 


99 
1122


gi2598461 


Homo sapiens 




1893 


97 


1122 


gi31418316 


Homo sapiens 


Heat shock 70kD protein 
binding protein 


1893 


97 


1122 


gi4049268


Homo sapiens 


putative tumor suppressor 
ST13 


1893 


97 


1123 


gil 1991844 


Homo sapiens 


AF243505_1 fibrocyte-derived
protein 


676 


100 


1123 


gil2619173 


Homo sapiens 


melanoma inhibitory activity 
like protein 


676 


100 


1123 


gi 12668328 


Homo sapiens 


melanoma inhibitory activity 
like protein 


676 


100 


1124 


gi22760096 


Homo sapiens 


unnamed protein product 


1047 


89 


1124 


gi27883913 


Homo sapiens 


POTE 


525 


46 


1124 


gi28279813 


Homo sapiens 


Similar to hypothetical protein 
DKFZp434A171 


743 


85 


1125 


gil 1990779 


Homo sapiens 




548 


43 


1125 


gi22760096 


Homo sapiens 


unnamed protein product 


831 


87 


1125 


gi28279813 


Homo sapiens 


Similar to hypothetical protein 
DKFZp434A171 


743 


85 


1126 


gil 1493483 


Homo sapiens 


AF130117_48 PRO2550


265 


67 


1126 


gil872200 


Homo sapiens 


alternatively spliced product 
using exon 13A


259 


66 


1126 


gi7770139 


Homo sapiens 


AF119917_13 PRO1722


266 


60 


1128 


gi 16588454 


Homo sapiens 


AF312374_1 AGTRAP protein 


708 


95 


1128 


gil 6878260 


Homo sapiens 


AAH17328 Similar to
angiotensin II, type I receptor- 
associated protein 


726 


100 


1128 


gi9621816 


Homo sapiens 


AF165187_1 ATRAP


708 


95 


1129 


gi 12330704 


Mus musculus 


AF333770_1 cell recognition
molecule CASPR4 


1376 


71 


1129 


gil7986216 


Homo sapiens 


AF333769_1 cell recognition 
molecule CASPR3 


1864 


98 


1129 


gi21961652 


Mus musculus 


contactin associated protein 4 


1376 


71 


1130 


gil7986216 


Homo sapiens 


AF333769_1 cell recognition 
molecule CASPR3 


6812 


99 


1130 


gil8390059 


Homo sapiens 


AF463518_1 cell recognition
protein CASPR4 


4738 


70 


1130 


gi21961652


Mus musculus 


contactin associated protein 4 


4709 


68 


1131 


gi 10336504 


Homo sapiens 


UDP-GalNAc: polypeptide N- 
acetylgalactosaminyltransferase 


2014 


61 


1131 


gi21552746 


Homo sapiens 


AF410457_1 putative
polypeptide N- 

acetylgalactosaminyltransferase 


3157 


99 


1131 


gi21552969 


Mus musculus 


AF467979_1 Williams-Beuren
syndrome critical region gene 
17 


3098 


97 


1132 


gil3625176 


Homo sapiens 


AF251057_1 thrombospondin


575 


46 


1132 


gil8490857 


Homo sapiens 


Thrombospondin 


575 


46 


1132 


gi31 127148 


Mus musculus 


2610028F08Rik protein 


860 


96 


1133 


gil 1907599 


Homo sapiens 


AF208291_1 protein kinase
HIPK2 


857 


50 


1133 


gi5305331 


Mus musculus 


AF071070_1 protein kinase 
Myak-L 


856 


49


1133 


gi5815145 


Mus musculus 


AF170304_1 nuclear body 
associated kinase 2b 


856 


49 
1134


gi22267965 


Homo sapiens 


Similar to KIAA1423 protein 


322 


100 


1134 


gi7243227 


Homo sapiens 


KIAA1423 protein 


322 


100 


1134 


gi7300805 


Drosophila 
melanogaster 


CG13409-PA 


171 


51 


1135 


gi 13529338 


Mus musculus 




1862 


48 


1135 


gil4571502 


Homo sapiens 


calcium-promoted Ras 
inactivator 


4174 


99 


1135 


gi4185294 


Homo sapiens 


rasGAP-activating-like protein 


1891 


48 


1137 


gil5128103 


Mus musculus 


AF397007_1 nephronectin 


2962 


87 


1137 


gil5128105 


Mus musculus 


AF397008_1 nephronectin 


2934 


85 


1137 


gil5430246 


Mus musculus 


nephronectin short isoform 


2802 


83 


1138 


gil6041675 


Homo sapiens 


AAH15704 joined to JAZF1 


2622 


100 


1138 


gi 17862954 


Drosophila 
melanogaster 


SD04959p 


904 


42 


1138 


gi30046920 


Mus musculus 


D11Ertd530e protein


1941 


96 


1139 


gi 12654929 


Homo sapiens 


AAH01311 mesenchymal stem 
cell protein DSCD75 


719 


74 


1139 


gil7512251 


Homo sapiens 


AAH19104 mesenchymal stem 
cell protein DSCD75 


716 


74 


1139 


gi7638247 


Homo sapiens 


AF242773_1 mesenchymal 
stem cell protein DSCD75 


719 


74 


1140 


gi32967231 


Homo sapiens 


TAFA3 


481 


100 


1140 


gi32967237 


Homo sapiens 


TAFA3.2 


923 


100 


1140 


gi32967243 


Mus musculus 


TAFA3 


390 


82 


1141 


gi32967231


Homo sapiens 


TAFA3 


738 


100 


1141 


gi32967237 


Homo sapiens 


TAFA3.2 


481 


100 


1141 


gi32967243 


Mus musculus 


TAFA3 


634 


87 


1142 


gil0443967 


Homo sapiens 


AF268610_1 THEG protein 


1934 


88 


1142 


gi20306274 


Homo sapiens 


testicular haploid expressed 
gene 


1934 


88 


1142 


gi7416134 


Homo sapiens 


testis-specific gene 


1934 


88 


1143 


gi21928259 


Homo sapiens 


seven transmembrane helix 
receptor 


1023 


100 


1143 


gi21928496 


Homo sapiens 


seven transmembrane helix 
receptor 


1023 


100 


1143 


gi21928655 


Homo sapiens 


seven transmembrane helix 
receptor 


916 


89 


1144 


gi 18480746 


Mus musculus 


olfactory receptor MOR261-10 


1278 


79 


1144 


gi21928655


Homo sapiens 


seven transmembrane helix 
receptor 


1456 


93 


1144 


gi32052225 


Mus musculus 


olfactory receptor
GA_x6K02T2P3E9-4341246-
4340281


1278 


79 


1146 


gil5779092 


Homo sapiens 


AAH14613 Similar to syntaxin
18 


1295 


100 


1146 


gi30583139 


Homo sapiens 


syntaxin 18 


1295 


100 


1146 


gi30585223 


synthetic construct 


Homo sapiens syntaxin 18 


1295 


100 


1147 


gil4573319 


Homo sapiens 


AF334755_1 interleukin-1
HY2 


812 


99 


1147 


gi 14573321 


Homo sapiens 


AF334756_1 interleukin-1
HY2 


812 


99 


1147 


gi 18025344 


Homo sapiens 


interleukin-1 receptor 
antagonist-like FIL1 theta 


809 


99 


1148 


gi 1668744 


Homo sapiens 


HHa5 hair keratin type I 


1114 


72
1148
1148 

1149 


gi3724107 
gi4103158 

gi23271416 


Homo sapiens
Mus musculus

Mus musculus


type I hair keratin 5 
hair keratin acidic 5; Ha5 
keratin 

Leprel protein 


1114 
1116 

141 


72 
72 

30 


1149 
1149 

1150 


gi30582917 
gi6166378 

gi 16550754 


Homo sapiens
Mus musculus

Homo sapiens


1 

AF165163_1 growth
suppressor 1L 


139 
143 


30 
30 


1150 

1150 
1151 


gil699265 

gi27529955 
gil4595019 


Homo sapiens

Mus musculus
Homo sapiens 


unnamed protein product 
malignant cell expression- 
enhanced gene/tumor 
progression-enhanced gene 
mRBI 

keratin 6 irs 


1337 
389 

1284 
1990 


90 
57 

86 
76 


1151 
1151 
1152 


gil8031724 
gi27901522 
gi 11066090 


Mus musculus
Homo sapiens
Homo sapiens


keratin protein K6irs 

keratin 6 irs3 
AF195192_1 matrix 
metalloprotease MMP-27 


1948 
2519 
2233 


75 
94 
84 


1152 
1152 


gil2006364 
gi3511149


Tupaia belangeri
Gallus gallus 


AF281673_1 matrix 
metalloproteinase-27 


1859 


71 


1153 


gil 1066090 


Homo sapiens


matrix metalloproteinase 
AF195192_1 matrix 
metalloprotease MMP-27 


1213 
2233 


50 
84 


1153 
1153 


gi 12006364 
gi3511149 


Tupaia belangeri 
Gallus gallus 


AF281673_1 matrix 
metalloproteinase-27 
matrix metalloproteinase 


1859 
1213 


71 
50 


1154 
1154 


gi24710913 
gi5739507


Homo sapiens 
Homo sapiens 


suppressor of fused 
AF175770_1 suppressor of 
fused 


2599 
2594 


100 
99 


1154 


gi6689894 


Homo sapiens 


AF159447_1 Suppressor of 
Fused 


2599 


100 


1155 


gi20387085 


Oncorhynchus 
mykiss 


-1 


680 


31 


1155 


gi21667212




AF465766_1

bactericidal/permeability- 
increasing protein-like 2 


2600 


100 


1155 

1156
1156


gi28 173296 

gil2082687 
gi24047297 


Cyprinus carpio 

Homo sapiens 
Homo sapiens 


bactericidal permeability- 
increasing 

protein/lipopolysaccharide- 

binding protein 

Sry-related HMG-box protein 


702 
2066 


31 
100 


1156
1157 

1157 


gi8894593 
gil9526647 

gi21758574 


Homo sapiens

Homo sapiens 
Homo sapiens 


SRY-box 18 
oUAio protein 
AF462348_1 oxidored-nitro
domain-containing protein 


2066 
2066
842 


100 
100 
92 


1157


gi7303522 


Drosophila 

melanogaster


unnamed protein product 
CG13178-PA 


922 
173 


97 
32 


1158 
1158 


gi 19526647 
gi21758574


Homo sapiens 
Homo sapiens 


AF462348_1 oxidored-nitro
domain-containing protein 


842 


92 


1158 

1159
1159


gi7303522

gi1794221
gi1794223


Drosophila 
melanogaster 
Mus musculus 
Mus musculus 


unnamed protein product 
CG13178-PA 

DNA ligase III-beta

DNA ligase III-alpha


922 
173 

2977 
2977 


97 
32 

89 
89 
1159


mOG1«777 

gizy 10D /ZZ 


Mus musculus 


ligase III, DNA, ATP-
dependent


3U1U 


oo 
ay 


1160 


gi 1052871 


Homo sapiens 


squamous cell carcinoma 
antigen 2


879 


45 




gl I joo/^iy 


Homo sapiens


nr 1 * 1 1 iy I 1 OCJvr UNOIZ 


ZUOj 


yj 




gloo /40j 


Homo sapiens


leupin


o ly 




1163 


gi29611342 


Homo sapiens 


AF425650_1 MBD1- 
containing chromatin associated
factor 


352 


52 


1163 


gi7228149 


Mus musculus 


ATFa-associated factor 


357 


29 


1 1 61 

I iOj 


gl /jOj /UD 


Drosophila 
melanogaster 


r y n'\ a d a 
^OlZj4U-rA 


1 0*7 

lo / 


31 




ml 471 1 7QQ 
gl 14Z1 1 Dyo 


Homo sapiens 


Ar3ouj*fz_i caspase-oL* 


Z03 


1 AA 
1UU 


1 166 
1 100 


m 1 QAA1 ^77 

giiy^ui oz/ 


Homo sapiens 


procaspase-8 


ZZ3 


yj 


1 166 
1 100 


glZUjol^ZO 


Homo sapiens 


Similar to caspase 8, apoptosis- 
related cysteine protease 


Zo3 


i An 
100 


1 167 
1 10/ 




Homo sapiens


rL/juuuou protein 


1 7A/1 


yo 


1 167 
I lO / 


gl^UnOOUo*f 


Bos taurus


killer cell immunoglobulin-like
receptor KIR3DS1 


oUU 


jj 


1 167 
i 10 / 


glOU*fOOUOO 


Bos taurus


killer cell immunoglobulin-like
icucpior j\ii\jL/Li 


7Ci 




1168 




Rattus norvegicus


TIP120




QQ 
yy 


1168 


glZ>7 / 7Z> 1UU 


Homo sapiens


TIP120 protein




QQ 
yy 


1168 


gi7688703 


Homo sapiens 


AF157326_1 TIP120 protein


4573 


99 


1169 


gil3016701 


Homo sapiens 


activating coreceptor NKp80 


1226 


100 




oi774AQR67 


Macaca fascicularis


in jvpou in iv receptor 


1 1 07 
1 1ZZ 


on 
y\) 


1 16Q 


gl / loo JO / 


Homo sapiens


AF175206_1 lectin-like
receptor F1


1776 
1ZZ0 


1UU 


1 171 
1 1 / 1 


ai7161Q1Qft 

gizioiy iy\j 


Homo sapiens


-nice lA-unKeo 


77Q< 
Z/oD 


1UU 


1171 


gi3021409 


Homo sapiens 


like 1 protein 


3057 


100 


1171 


gi30353941 


Homo sapiens 


TBL1X protein 


3057 


100 


1 1 *7*) 
1 1 11 


giloyyzDj 


Homo sapiens 


malignant cell expression- 
enhanced gene/tumor 
progression-enhanced gene 


cn i 

671 


65 


1 1 *70 
1 1 /Z 


giz/jzyyjj 


Mus musculus 


moo I 


o4o 


67 


I 1 77 

I I /Z 


glJJjDDOy 1 


Homo sapiens 


transmembrane channel-like 
protein 4 


o4z 


100 


1173 


gil699265 


Homo sapiens 


malignant cell expression- 
enhanced gene/tumor 
progression-enhanced gene 


671 


65 


1 171 
1 1 ID 


m77 < \7QQ*\5 
glZ/DZySOj 


Mus musculus


rnDDI 

moo i 


04o 


0/ 


1 1 71 
I I ID 


gljjjjjOy 1 


Homo sapiens 


transmembrane channel-like 
protein 4 


04z 


1 (\(\ 

1UU 


I 174 

I I /*t 


gl lOJ DU /34 



Homo sapiens 


unnamed protein product 


1 QQ1 


1 AA 

100 


1 17/1 


giioyyzoj 


Homo sapiens 


malignant cell expression-
enhanced gene/tumor
progression-enhanced gene




Q 1 
01 


1174 


gi27529955 


Mus musculus 


mBB1


1810 


95 


1175 


gil3182755 


Homo sapiens 


AF212237_1 HPHRP


1210 


100 


1175 


gi 15929309 


Homo sapiens 


Phosphotriesterase related 


1210 


100 


1175 


gi29791939 


Homo sapiens 


phosphotriesterase related 


1210 


100 


1177 


gil0047271 


Homo sapiens 


KIAA1598 protein 


789 


99 


1177 


gi22539701 


Mus musculus 


4930506M07Rik protein 


818 


96 


1177 


gi26349641 


Mus musculus 


unnamed protein product 


818 


96 
1178


gil4272704 


Homo sapiens 


unnamed protein product 


157 


96 


1178 




Homo sapiens


unnamed protein product


164 


100 


1178 


gil9575655 


Homo sapiens 


unnamed protein product 


164 


100 


1182 


gil3377880 


Cricetulus 
longicaudatus 


ArJ30U4J_l arginine i>- 
methyltransferase p82 isoform 


39S3 


OJ 


1182 


gil3377882 


Cricetulus 
longicaudatus 


Ar3 j0Uh4_1 arginine in- 

methyltransferase p77 isoform




o J 


1182 


gil3879453 


Mus musculus 


cDNA sequence BC006705 


3260 


85 


1183 


gi14424574


Homo sapiens 


AAnuy^lD pnospnauayibenne 
decarboxylase 


ill 


i no 


1183 


gi16306618


Homo sapiens 


AAnUi4oZ pnospnauuyiocrinc 
decarboxylase


1918 




1183 


giiy i iod 


Cricetulus griseus


phosphatidylserine
decarboxylase


1128 


88 


1 1 ox 

1184 


gllUUoozjJ 



Homo sapiens 


nr1ii/*/\f»rtt*f'ip/M/l— ITlfillPPH IrTT x 

^lueuuui iiu*jiu"iiiuuvs/u. xjii^iju 


460 


98 


1 1 OA 

1184 


glliyU/joU 


Mus musculus 


AR9019RQ 1 T<%P92-related 
inHnrtKlp Ipnrmp 7inner 3c 


891 


87 


1 1 QA 
1 184 


goyiyioi 


Homo sapiens


AF183393_1 TSC-22-like
protein


460 


98 


1185 


gi 13874437 


Homo sapiens 


cerebral protein-11


1457 


68 


1 1 OJ 


glZU^O / JnH 


Mus musculus


T OP91 9Q04 nrotein 


3064 


89 


1185 


gi24980850 


Homo sapiens 




3283 


100 


1 1 R£ 
1 100 


gl l*tUj J7 / O 


Homo sapiens


unnamed protein product


2577 


100 


1186 


gil4272784 


Homo sapiens 


unnamed protein product 


2577 


100 


1186 


giloyzJij 1 


Homo sapiens 


AF9H4970 1 RHRP-3S 


1431 


99 


1 1 0*7 

1187 


giloo/ooou 


Homo sapiens 


r L/Juu LLy proiein 


Q30 
yj\j 


97 


1187 


gil9343701 


Mus musculus 


RIKEN cDNA A630054L15 


913 


93 


1187 


gi25955706 


Homo sapiens 


Similar to hypothetical protein
MGC38041 




Q7 
y i 


1188 


gil7865311 


Homo sapiens 


AF452102_1 dipeptidyl
peptidase-like protein 9


HOHO 


L\J\J 


1188 


gi27549552 


Homo sapiens 


dipeptidyl peptidase IV-related 
protein-2


4646 


100 


1188 


gi29293087 


Homo sapiens 


dipeptidyl peptidase 9 


4787 


99


1189 


gi17865311


Homo sapiens 


AF452102_1 dipeptidyl
peptidase-like protein 9


43 84. 


OS 

yj 


1 1 on 

1189 


glZ/D450jZ 



Homo sapiens 


dipeptidyl peptidase IV-related
protein-2


4384 


95 


1 1 QO 

1 loy 


gizyzyjuo / 


Homo sapiens


dipeptidyl peptidase 9


4525 


95 


1 1 Oft, 




Homo sapiens


peptidase-like protein 9


4551 


98 


1 1 go 
I ly\J 


glZ/ JHyjDZ 


Homo sapiens


dipeptidyl peptidase IV-related
protein-2


4551 


98 


1190 


gi29293087 


Homo sapiens 


dipeptidyl peptidase 9 


4692 


98 


l lyi 


* 1 o f\(\1/ZA1 

giiiuy /o4Z 


Homo sapiens 


Ribosomal protein S25


554 


99 


1191 


gi13279149


Homo sapiens 


Ribosomal protein S25 


554 


99 


1191 


gil3436422 


Homo sapiens 


Ribosomal protein S25 


554 


99 


1192 


gi 16549206 


Homo sapiens 


unnamed protein product 


680 


100 


1193 


gi21756739 


Homo sapiens 


unnamed protein product 


4771 


97 


1193 


gi6453538 


Homo sapiens 


hypothetical protein 


4159 


99 


1193 


gi6634025 


Homo sapiens 


KIAA0379 protein 


3467 


67 


1194 


gi 12652695 


Homo sapiens 


AAH00096 HtrA-like serine 
protease 


2116 


93 


1194 


gi5870865 


Homo sapiens 


serine protease 


2116 


93 
1194


gi7672669 


Homo sapiens 


AF141305_1 serine protease
Htra2 


2116 


93 


1195 


gi1387985


Homo sapiens 


A3 adenosine receptor 


904 


100 


1195 


gi20988265 


Homo sapiens 


adenosine A3 receptor 


904 


100 


1195 


gi22658481 


Homo sapiens 


adenosine receptor A3 


904 


100 


1196 


gi24078514 


Mus musculus


AF454954_1 crossveinless-2


988 


91 


1196 


gi328 16043 


Mus musculus 


BMP-binding endothelial 
regulator precursor protein 


988 


91 


1196 


gi32892146


Homo sapiens 


crossveinless-2 


1085 


100 


1197 


gil8479346 


Mus musculus 


olfactory receptor MOR101-1 


1334 


82 


1197 


gil8480772 


Mus musculus 


olfactory receptor MOR101-2 


1415 


84 


1197 


gi32054443 


Mus musculus 


olfactory receptor 
GA X6K02T2PBJ9-2443810- 
2444775 


1415 


84


1198 


gi 16502 169 


Salmonella enterica 
subsp. enterica 
serovar Typhi 


putative DNA methylase 


751 


93 


1198 


gi29137981 


Salmonella enterica 
subsp. enterica 
serovar Typhi Ty2 


putative DNA methylase 


751 


93 


1198 


gi498768 


Serratia marcescens 


Deoxyadenosyl- 
methyltransferase 


330 


51 


1199 


gil213589 


Xenopus laevis 


Prostaglandin D Synthase 


290 


33 


1199 


gi 16974751 


Gallus gallus 


CALII 


335 


37 


1199 


gi666121 


Xenopus laevis 


cpl-1 


291 


33 


1200 


gi20987993 


Mus musculus 


MGC41336 protein 


1212 


90 


1200 


gi22296200 


Thermosynechococc 
us elongatus BP-1 


asparaginyl-tRNA synthetase 


1046 


46 


1200 


gi32448516 


Pirellula sp. 


asparaginyl-tRNA synthetase 


1034 


47 


1201 


gi20067381 


Homo sapiens 


ALMS1 protein 


242 


41 


1201 


gi21552774 


Mus musculus 


AF425257_1 Alstrom
syndrome 1 protein 


217 


38 


1201 


gi32693320 


Homo sapiens 


ALMS1 protein 


242 


41 


1202 


gi12655061


Homo sapiens 


AAH01380 


495 


92 


1202 


gi23574788 


Macaca fascicularis 


succinate dehydrogenase 
flavoprotein subunit 


502 


93 


1202 


gi5759173 


Homo sapiens 


succinate dehydrogenase 
flavoprotein subunit 


495 


92 


1203 


gi21928186 


Mus musculus 


GPI-gamma 4; GPIgamma4


1466 


61 


1203 


gi21928188 


Mus musculus 


GPI-gamma 4; GPIgamma4 


1466 


61 


1203 


gi30931171 


Mus musculus 


GPIgamma4 protein


1466 


61 


1204 


gil 50823 11 


Homo sapiens 


AAH12061 -binding protein 3 


1534 


92 


1204 


gi9957161 


Mus musculus 


AF176327_1 alphaCP-3


1708 


99 


1204 


gi9957165 


Homo sapiens 


AF176329_1 alphaCP-3 


1722 


100 


1205 


gil4574118 


Caenorhabditis 
elegans 


Dumpy : shorter than wild-type 
protein 19 


233 


31 


1205 


gi 16553246 


Homo sapiens 


unnamed protein product 


881 


99 


1205 


gi21739662 


Homo sapiens 


hypothetical protein 


830 


95 


1206 


gil2653341 


Homo sapiens 


AAH00439 beta 


1742 


94 


1206 


gi 12804943 


Homo sapiens 


AAH01924 beta 


1742 


94 


1206 


gi31071


Homo sapiens 


E-1 beta subunit of the
pyruvate dehydrogenase 
complex 


1742 


94 


1207 


gil64851 


Oryctolagus 


calsequestrin precursor 


1908 


94 
cuniculus








1207 


gi2618621 


Mus musculus 


skeletal muscle calsequestrin 


1938 


94 


1207 


gi688292 


Homo sapiens 


calmitine; calsequestrine 


2029 


100 


1209 


gil0432376 


Homo sapiens 




3334 


99 


1209 


gil 1034760 


Homo sapiens 


NIBAN


3692 


99


1209 


gil2620192 


Homo sapiens 


AF288391_1 C1orf24


4775 


99 


1210 


gi2982508 


Homo sapiens 


TCR beta chain 


1290 


93 


1210 


gi3002925 


Homo sapiens 


T cell receptor beta chain 


1277 


93 


1210 


gi3089433 


Homo sapiens 


T cell receptor beta chain 


1028 


75 


1211 


gil2006041 


Homo sapiens 


AF267857_1 AD038


761 


98 


1211 


gil4189960 


Homo sapiens 


AF305818_1 PRO0764


141 


53 


1211 


gi33338042 


Homo sapiens 


AF173896_1 MSTP121


143 


46 


1213 


gi 17939498 


Homo sapiens 


AAH19299 protocadherin 
gamma subfamily C, 3 


4777 


99 


1213 


gi20072790 


Homo sapiens 


protocadherin gamma 
subfamily C, 3 


4777 


99 


1213 


gi2995719 


Homo sapiens 


protocadherin 43 


4792 


100 


1214 


gi 12803363 


Homo sapiens 


CALR protein 


1747 


99 


1214 


gil8088117 


Homo sapiens 


AAH20493 calreticulin 


1747 


99 


1214 


gi30583735 


Homo sapiens 


calreticulin 


1747 


99 


1215 


gi200962 


Mus musculus 


serine 1 ultra high sulfur 
protein 


254 


38 


1215 


gi200964 


Mus musculus 


serine 2 ultra high sulfur 
protein 


299 


43


1215 


gi3228237 


Homo sapiens 


ultra high sulfer keratin 


248 


36 


1218 


gi 17223709 


Homo sapiens 


selenoprotein SelM 


235 


100 


1218 


gil7223711 


Mus musculus 


selenoprotein SelM 


188 


78 


1218 


gi26351995 


Mus musculus 


unnamed protein product 


162 


76 


1221 


gil001963 


Homo sapiens 


osteopontin 


1400 


90 


1221 


gil89151 


Homo sapiens 


nephropontin precursor 


1400 


90 


1221 


gi992950 


Homo sapiens 


OPN-c 


1426 


98 


1222 


gi14326586


Homo sapiens 


AF386078_1 serine-cysteine 
proteinase inhibitor clade C 
member 1


2252 


95 


1222 


gil79130 


Homo sapiens 


antithrombin III 


2252 


95 


1222 


gi583741 


synthetic construct 


Antithrombin III 


2252 


95


1223 


gi18088363


Homo sapiens 


AAH20669 advanced 
glycosylation end product- 
specific receptor 


2004 


99 


1223 


gil841550 


Homo sapiens 


AAB47491 receptor for 
advanced glycosylation end 
products 


2004 


99 


1223 


gi561659 


Homo sapiens 


receptor of advanced 
glycosylation end products of 
proteins 


2004 


99 


1224 


gil3359193 


Homo sapiens 


KIAA1660 protein 


598 


100 


1225 


gi37231 


Homo sapiens 


DNA topoisomerase II 


8061 


99 


1225 


gi3869382 


Homo sapiens 


DNA topoisomerase II beta 


8048 


99 


1225 


gi790988 


Cricetulus 
longicaudatus 




7886 


97 


1226 


gil881713 


Rattus norvegicus 


fatty acid transport protein 


3039 


87 


1226 


gi20810561 


Mus musculus 


, member 1 


3031 


87 


1226 


gi563829 


Mus musculus 


fatty acid transport protein 


3031 


87 


1227 


gil5080010 


Homo sapiens 


AAH11789 Similar to COP9 


503 


44 
complex subunit 7a






1227 


gil5215085 


Mus musculus 


Cops7b protein 


885 


71 


1227


gu>3uyi /o 


Mus musculus




888 


71 


1228


m *1 CA9^1 
gliOvZO 1 


Homo sapiens




544 


58 


1228


m AO/19 ft Q£ 


Mus musculus




Zf JO 


01 


1228


gioy4Zuyo 


Mus musculus 




01 8 




1229 


gil5620819 


Homo sapiens 


KIAA1880 protein 


2851 


99 


1229 


gll7ooiy52 


Drosophila 
melanogaster 


t nm o>i7n 
LfUUiy*t/p 


11R9 
1 joZ 


sri 


1229 


gi7291183 


Drosophila 
melanogaster 


CG1826-PA 


1382 


50 


1230 


gi21756739 


Homo sapiens 


unnamed protein product 


2878 


58 


1230 


gi26354957 


Mus musculus 


unnamed protein product 


5453 


95 


1230 


gi6634025 


Homo sapiens


KIAA0379 protein 


3166 


57


1231 


gi20387085 


Oncorhynchus 
mykiss 


-1 


ooz 


31 


1231 


gi21667212


Homo sapiens 


AF465766_1 
bactericidal/permeability- 
increasing protein-like 2 


Z3 84 


no 


1231 


gi28 173296 


Cyprinus carpio 


bactericidal permeability- 
increasing 

protein/lipopolysaccharide- 
binding protein 


/"OA 

680 


31 


1232 


gi20387085 


Oncorhynchus 
mykiss 


-1 


654 


31 


1232 


gi21667212


Homo sapiens 


AF465766_1 
bactericidal/permeability- 
increasing protein-like 2 


ii on 
z3o9 


no 
yy 


1232 


gi28 173296 


Cyprinus carpio 


bactericidal permeability- 
increasing 

protein/lipopolysaccharide- 
binding protein 


o/Z 


30 


1233 


gi20387085 


Oncorhynchus 
mykiss 


-1 


688 


31 


1233 


gi21667212 


Homo sapiens 


AF465766_1

bactericidal/permeability- 
increasing protein-like 2 


2595 


yy 


1233 


gi28 173296 


Cyprinus carpio 


bactericidal permeability- 
increasing 

protein/lipopolysaccharide- 
binding protein 


710 


31 


1234 


gllozj /341 



Mus musculus 


Expressed sequence 

A. W UOUZU / 


91 

Z 1UO 


yjy 


1234 


gi2191168 


Arabidopsis thaliana 


contains similarity to myosin 
heavy chain 


207 


26 


111 A 

1234 


glzo /yoU4 


r 

S chizosaccharomyce 


CD ATOl A 1 1 fin 


1 

IOj 


zo 


1235 


gi 11493528 


Homo sapiens 


AF130117_71 PRO1953


671 


100 


1236 


gi21754036 


Homo sapiens 


unnamed protein product 


998 


99 


1236 


gi30411057 


Mus musculus 


RIKEN cDNA B230219D22 


954 


93 


1236 


gi31565787 


Homo sapiens 


FLJ37562 protein 


1002 


100 


1237 


gi27469556 


Homo sapiens 


Putative neuronal cell adhesion 
molecule 


3516 


99 


1237 


gi3068592 


Mus musculus 


punc 


2976 


86 


1237 


gi4206390 


Homo sapiens 


putative neuronal cell adhesion 


1569 


98 
molecule






1238 


gil2667401 


Homo sapiens 


AF326731_1 NUF2R


2347 


99 


1238 


oil 43 17902 


Homo sapiens


kinetochore protein Nuf2 


2347 


99 


1238 


gil 8043223 


Mus musculus 


NUF2R protein 


1754 


73 




oi 1 04^ 5403 


Homo sapiens


unnamed protein product


2702 


99 


1239


<n7099001 


Homo sapiens


unnamed protein product


3682 


99 


1239


ot76R8176 


Homo sapiens


hypothetical protein


3688 


99 




cri91 6^4893 


Homo sapiens


AF389428_1 semaphorin 6D
isoform 3


5142 


91 


1240 


gi21634825


Homo sapiens 


AF389429_1 semaphorin 6D
isoform 4 


5667 


98 


1240 


gi21634827


Homo sapiens


AF389430_1 semaphorin 6D
isoform 1 


3112 


63 


1241


ail 4036200 


Homo sapiens


unnamed protein product


245 


97 


1243


pi91671 105 

gl<£> 1U / 1 1VJ 


Homo sapiens


RAD52B 


1134 


100


1243


pi934fi8359 


Homo sapiens


Similar to RAD52B 


963 


99 


1243


ai 32067621 


Mus musculus


2410008M22Rik protein


828 


74 


1244 


gil5928404 


Mus musculus 


Fasting-inducible integral 
membrane protein TM6P1


185 


36 


1244 


gil8490578 


Mus musculus 


A630041N19 protein 


449 


71 


1244


pi90370096 


Mus musculus


Fasting-inducible integral
membrane protein TM6P1


185 


36 


1245


gi18490578


Mus musculus


A630041N19 protein


875 


70 


1245


cH90707?29 


Homo sapiens


FLJ90024 protein


297 


33 


1245 


gi6013381 


Rattus norvegicus 


AF186469_1 TM6P1


296 


33 


1246




Homo sapiens


r , n1rMiim-"nprmP5iniP Qtnrp- 

V£Hv>lLllll UCi lll^ilLJlw OLVJIW 

nnpratpH f*h;innpf TRPN/13p 

UpCLalCU 1 l\jf 1VIJW 


1194 


100 


1246


ot'98696953 


Homo sapiens


calcium-permeable store-

operated channel TRPM3d 


1194 


100 


1246 


gi28626255 


Homo sapiens 


calcium-permeable store- 
operated channel TRPM3e 


1194 


100 


1247 


gi17386053


Mus musculus


AF444274_1 Jedi protein


2269 


50 


1247 


gi18044366


Homo sapiens


AAH20198 Similar to 
MEGF10 protein


3468 


99 


1247 


gil8252658 


Mus musculus 


AF461685_1 Jedi-736 protein


2269 


50 


1248 


gi20987880 


Mus musculus 


E130103I17Rik protein 


3580 


87 


1248


pi'? 82049 17 


Mus musculus


E130103I17Rik protein


3801 


86 


1248 


gi4588087 


Homo sapiens 


AF095771_1 PTH-responsive 
osteosarcoma B1 protein


4080 


94 


1249


pil 3501 434 


Homo sapiens




1160 


100 


1249


pi13501435 


Homo sapiens




976 


99 


1249


pil 001 3471 


Homo sapiens




1265 


99 


1250 


gil6605581 


Homo sapiens 


H-rev107-like protein 5


1451 


100 


1250


ai91 7070RO 


Homo sapiens


Similar to H-rev107-like
protein 5


1382 


96 


1250 


gi6048565


Homo sapiens


AF092922_1 retinoid inducible
gene 1 


376 


54 


1251 


gi21263094 


Rattus norvegicus 


AF512430_1 tramdorin 1


1665 


81 


1251 


gi27924388 


Mus musculus 


Tramdorin 1 


1668 


82 


1251 


gi3 1871293 


Homo sapiens 


proton/amino acid transporter 2 


2010 


99 


1252 


gil4571904 


Rattus norvegicus 


AF361239_1 lysosomal amino
acid transporter 1 


1931 


78 


1252 


gi3 1324239 


Homo sapiens 


proton-coupled amino acid 
transporter 


2174 


90 
1252


gi3 1871291 


Homo sapiens 


proton/amino acid transporter 1 


2195 


90 


1254 


gil563885 


Homo sapiens 


fibroblast growth factor 
homologous factor 1 


917 


90 


1254 


gil669500 


Mus musculus 


fibroblast growth factor 
homologous factor 1 


917 


90 


1254 


gi20988932 


Mus musculus 


Fgf12 protein


916 


98 


1255 


gil9263005 


Ciona intestinalis 


leucine-rich repeat dynein light 
chain 


759 


75 


1255 


gi2760161


Anthocidaris 
crassispina 


outer arm dynein light chain 2 


658 


67 


1255 


gi7303901


Drosophila
melanogaster 


CG8800-PA 


554 


58 


1256 


gil2666529 


Mus musculus 


b,b-carotene-9',10'-
dioxygenase 


2356 


80 


1256 


gi 12666531 


Homo sapiens 


putative b,b-carotene-9',10'-
dioxygenase 


2982 


99 


1256 


gil4582265 


Homo sapiens 


AF276432_1 putative carotene 
dioxygenase 


2918 


99


1257 


gi 12666529 


Mus musculus 


b,b-carotene-9',10'-
dioxygenase 


2305 


81 


1257 


gil2666531 


Homo sapiens 


putative b,b-carotene-9',10'-
dioxygenase 


2850 


96 


1257 


gi 14582265 


Homo sapiens 


AF276432_1 putative carotene 
dioxygenase 


2786 


95 


1258 


gi 15559697 


Homo sapiens 


AAH14205 Similar to neural
cell adhesion molecule 1 


157 


28 


1258 


gi28703938 


Homo sapiens 


Similar to neural cell adhesion 
molecule 1 


157 


28 


1258 


gi61 


Bos taurus 


calmodulin-independent 
adenylate cyclase 


158 


28 


1260 


gil079734 


Mus musculus 


citron 


1291 


94 


1260 


gi2745840 


Rattus norvegicus 


postsynaptic density protein; 
citron 


1262 


93 


1260 


gi3599509 


Mus musculus 


rho/rac-interacting citron 
kinase 


1286 


94 


1261 


gi28277755 


Danio rerio 


proteinase inhibitor, clade E, 
member 2 


479 


30 


1261 


gi28435507 


Sus scrofa 


nexin-1 


467 


30 


1261 


gi32485107 


Homo sapiens 


nexin-related serine protease 
inhibitor 


2002 


92 


1262 


gi 13383364 


Homo sapiens 


claudin-1 


223 


97 


1262 


gi15214678


Homo sapiens 


AAH12471 claudin 1 


223 


97 


1262 


gi7381083 


Homo sapiens 


AF134160_1 claudin-1


223 


97 


1263 


gi13542685


Mus musculus 


SAR1a gene homolog


441 


54 


1263 


gi21634445 


Homo sapiens 


AF274026_1 GTP-binding
protein Sara 


446 


57 


1263 


gi33150636


Homo sapiens 


AF087897_1 GTP binding
protein 


446 


57 


1264 


gi22902436 


Mus musculus 


Sphingosine-1-phosphate
phosphatase 1 


717 


38 


1264 


gi23345324 


Homo sapiens 


sphingosine 1-phosphate 
phosphohydrolase 2 


2073 


100 


1264 


gi29436890 


Mus musculus 


Similar to sphingosine-1-
phosphate phosphotase 2 


1624 


80 


1265 


gi!4 


Bos taurus 


BoWC1.1


1214 


39 
1265


gi1480365


Sus scrofa 


scavenger-receptor protein 


1327 


42 


1265 


gi27464818 


Mus musculus 


scavenger receptor cysteine- 
rich type 1 protein CD163c-
alpha precursor 


1339 


44 


1266 


gil4 


Bos taurus 


BoWC1.1


1214 


39 


1266 


gi1480365


Sus scrofa 


scavenger-receptor protein 


1327 


42 


1266 


gi27464818 


Mus musculus 


scavenger receptor cysteine- 
rich type 1 protein CD163c-
alpha precursor 


1339 


44 


1268 


gi21619491 


Homo sapiens 


similar to expressed sequence 
AW049604 


778 


100 


1268 


gi32967233 


Homo sapiens 


TAFA4 


778 


100 


1268 


gi32967245 


Mus musculus 


TAFA4 


698 


93 


1270 


gi18033185


Danio rerio 


AF330001_1 UNC45-related
protein 


3100 


73 


1270 


gi27436424 


Mus musculus 


striated muscle UNC45 


3937 


94 


1270 


gi27436426 


Homo sapiens 


striated muscle UNC45 


4092 


99 


1271 


gi21064657 


Drosophila 
melanogaster 


RH01479p 


182 


39 


1271 


gi28375475 


Homo sapiens 


unnamed protein product 


639 


99 


1271 


gi7304173 


Drosophila 
melanogaster 


CG1577-PA 


182 


39 


1272 


gi16876958


Homo sapiens 


AAH16754 hypothetical
protein MGC12217 


410 


100 


1273 


gi15823642


Homo sapiens 


ALS2CR7 


2038 


100 


1273 


gi32485022 


Homo sapiens 


serine/threonine protein kinase 


2038 


100 


1273 


gi32485027 


Homo sapiens 


serine/threonine protein kinase 


2320 


100


1274 


gi12654893


Homo sapiens 


AAH01291 


400 


97 


1274 


gi2407911 


Homo sapiens 


C016 


714 


96 


1274 


gi6733554 


unidentified 


unnamed protein product 


710 


96 


1275 


gi18147612


Homo sapiens 


metalloprotease disintegrin 


4434 


95 


1275 


gi21908028 


Homo sapiens 


AF466287_1 a disintegrin and
metalloprotease domain 33 


4434 


95 


1275 


gi21908030


Homo sapiens 


a disintegrin and 
metalloprotease domain 33 


4434 


95 


1276 


gi16551401


Homo sapiens 


unnamed protein product 


2735 


100 


1276 


gi4972116 


Arabidopsis thaliana 


putative proline-rich protein 


133 


44 


1276 


gi7269638 


Arabidopsis thaliana 


putative proline-rich protein 


133 


44 


1277 


gi15291913


Drosophila 
melanogaster 


LD31582p 


204 


23 


1277 


gi22477165 


Homo sapiens 




2783 


100 


1277 


gi26326895 


Mus musculus 


unnamed protein product 


1752 


69 


1278 


gi3452275 


Pseudopleuronectes 
americanus 


aminopeptidase N 


1008 


37 


1278 


gi525287 


Sus scrofa 


aminopeptidase N. 


1014 


38 


1278 


gi544755 


Oryctolagus 
cuniculus 


aminopeptidase N; APN 


1021 


37 


1279 


gi13559063


Homo sapiens 




747 


100 


1279 


gi24416538 


Mus musculus 


1700001D09Rik protein 


708 


71 


1279 


gi9963863 


Homo sapiens 


AF226731_1 AD026


738 


98 


1281 


gi20810533


Homo sapiens 


hypothetical gene supported by 
AK054745; AK054745; 
AK054745; AK054745 


414 


100 


1282 


gi20810533


Homo sapiens 


hypothetical gene supported by
AK054745; AK054745


795


100






1282 


gi26345254


Mus musculus


unnamed protein product


367 


63 


1282 


gi33244011 


Mus musculus 




374 


64 


1283 




Homo sapiens


hypothetical gene supported by 
AK054745; AK054745; 


789 


99 


1283 


gi26345254


Mus musculus


unnamed protein product 


396 


64 


1283 


gi33244011


Mus musculus




403 


65 


1284 


gi18447388


Drosophila
melanogaster


RE05944p 


700 


31 


1284 


gi21645210 


Drosophila
melanogaster


CG30394-PA 


700 


31 


1284 




T"^ iv* c A r\ n J 1 0 

Drosophila
melanogaster




700 


31 


1285 


gi14035874





unnamed protein product


yio 


99 


1285 


gi14035876


Homo sapiens


unnamed protein product




99 


1285 


gi20070842


Homo sapiens


similar to hypothetical protein
FLJ13448


yy/ 


99 


1286 


gi19070822


Mus musculus


ArjDHouo 1 iviyo proiem 
P42POP 






1286 


gi20977688


Xenopus laevis




1 A/C 


33 


1286 


gi27881626 


Homo sapiens 


LOC339344 protein 


150 


25 


1287 


gi10433236


Homo sapiens


unnamed protein product




99 


1288 


gi13278415


Mus musculus 


cDNA sequence BC004018 


2402 


98 


1288 


gi26355239


Mus musculus


unnamed protein product




97 


1288 


gi30354720 


Mus musculus 


AI427653 protein 


1357 


57 


1289 


gi12698037


Homo sapiens


ruiAAi /4o protein 


5541 


100 


1289 


gi16769274


Drosophila
melanogaster


LD22423p 


210 


24 


1289 




Drosophila
melanogaster




214 


24 


1290 


gi21391484


Homo sapiens


leucine-rich repeat domain-
containing protein


397 


39 


1290 


gi21391486 


Mus musculus


leucine-rich repeat domain-
containing protein 


All 


40


1290 


gi21623740


Rattus norvegicus 


Leucine-rich repeat-containing 
protein 3 


428 


40 


1291 


gi20269073


Homo sapiens


putative lipid kinase


2006


76


1291 


gi21624340


Homo sapiens


ceramide kinase 


2006 


76 


1291 


gi21624342


Mus musculus


ceramide kinases 


1617


64 


1292 


gi312590


Mus musculus


biliary glycoprotein 


193 


32 


1292 


0135401 59 


Homo sapiens


KjLy IZh I 


175 


31


1292 


m 74 14626 


Rattus norvegicus


carcinoembryonic antigen- 
related cell adhesion molecule, 
secreted isoform CEACAM1a-
4C1 


176 


31 


1293 


gi1197500


Homo sapiens 


T-cell surface antigen 


182 


22 


1293 


gi21707370 


Homo sapiens 


, sheep red blood cell receptor 


182 


22 


1293 


gi312590


Mus musculus 


biliary glycoprotein 


193 


32 


1294 


gi18676564


Homo sapiens 


FLJ00179 protein


993 


99 


1294 


gi214U450 


Mus musculus 


C230093N12Rik protein 


1159 


91 


1294 


gi28839684 


Homo sapiens 


Similar to expressed sequence 
AI426465 


1242 


99 
1295


gi27923578 


Mus musculus 


cerebellin 4 precursor 


970 


96 


1295 


gi33416458


Mus musculus 


Cerebellin 2 precursor protein 


725 


73 


1295 


gi7708438 


Homo sapiens 




1020 


100 


1296 


gil8490912 


Homo sapiens 


neurotensin receptor 2 


1950 


93 


1296 


gi23138725


Homo sapiens 


Similar to neurotensin receptor 
2 


1984 


99 


1296 


gi3901028 


Homo sapiens 


neurotensin receptor 2 


1955 


93 


1297 


gi15077861


Mus musculus 


AF396877_1 bullous
pemphigoid antigen 1-e 


11308 


84 


1297 


gi179519


Homo sapiens 


bullous pemphigoid antigen 


10559 


98 


1297 


gi403124 


Homo sapiens 


bullous pemphigoid antigen 


13047 


97 


1298 


gi15077861


Mus musculus 


AF396877_1 bullous
pemphigoid antigen 1-e 


11308 


84 


1298 


gi179519


Homo sapiens 


bullous pemphigoid antigen 


10559 


98 


1298 


gi403124 


Homo sapiens 


bullous pemphigoid antigen 


13047 


97 


1299 


gi27469519 


Homo sapiens 


Similar to KIAA0476 gene 
product 


1506 


100 


1299 


gi30268290 


Homo sapiens 


hypothetical protein 


1506 


100 


1299 


gi33330327 


Homo sapiens 


c-MYC promoter-binding 
protein IRLB 


1501 


100 


1300 


gi15929770


Mus musculus 


expressed sequence 
AW049604 


666 


100 


1300 


gi32967235 


Homo sapiens 


TAFA5 


666 


100 


1300 


gi32967247 


Mus musculus 


TAFA5 


666 


100 


1301 


gi16041156


Macaca fascicularis 


X-ray radiation resistance 
associated 1 protein 


729 


95 


1301 


gi18676652


Homo sapiens 


FLJ00225 protein 


779 


100 


1301 


gi33150874


Homo sapiens 


AF439934_1 unknown 


779 


100 


1302 


gi16041156


Macaca fascicularis 


X-ray radiation resistance 
associated 1 protein 


411 


93 


1302 


gi18676652


Homo sapiens 


FLJ00225 protein 


444 


97 


1302 


gi33150874


Homo sapiens 


AF439934_1 unknown


444 


97 


1303 


gi21619156 


Homo sapiens 


somatostatin 


226 


100


1303 


gi338288 


Homo sapiens 


preprosomatostatin I 


226 


100 


1303 


gi342299 


Macaca fascicularis 


preprosomatostatin 


226 


100 


1304 


gi22761332 


Homo sapiens 


unnamed protein product 


2052 


82 


1304 


gi24981080 


Mus musculus 


1810005H09Rik protein


1103 


55 


1304 


gi33417011 


Mus musculus 




2037 


93 


1305 


gi22761332 


Homo sapiens 


unnamed protein product 


3143 


100 


1305 


gi26331032 


Mus musculus 


unnamed protein product 


2468 


81


1305 


gi33417011 


Mus musculus 




2453 


85 


1306 


gi21744725 


Homo sapiens 


AF478693_1 glycosyl-
phosphatidyl-inositol-MAM 


1541 


48 


1306 


gi25005320 


Sus scrofa 


glycosylphosphatidylinositol 
anchor 1 protein 


1536 


48 


1306 


gi33149988


Homo sapiens 


MAM domain containing 1 


3035 


100 


1307 


gi16550524


Homo sapiens 


unnamed protein product 


799 


100 


1308 


gi20379980 


Mus musculus 


2410021P16Rik protein 


1731 


44 


1308 


gi22137453 


Mus musculus 


2410021P16Rik protein 


1734 


44 


1308 


gi28280023 


Mus musculus 


5730439E10Rik protein 


3348 


80 


1309 


gi20379980 


Mus musculus 


2410021P16Rik protein 


1634 


42 


1309 


gi22137453


Mus musculus 


2410021P16Rik protein


1637 


43 


1309 


gi28280023 


Mus musculus 


5730439E10Rik protein 


3226 


78 


1310 


gi19070124


Mus musculus 


AF233346_1 zinc transporter-
like 3 protein


1087


95


1310 


gi20563194 


Mus musculus 


AF395840_1 zinc transporter 6


1075 


94 


1310 


gi33338012 


Homo sapiens 


AF173387_1 MSTP103


942 


95 


1311 


gi12053097


Homo sapiens 


hypothetical protein 


2127 


99 


1311 


gi23170343


Drosophila
melanogaster


CG31556-PA


iyy


&y


1311 


gi854065 


Human herpesvirus 6


U88 


993 


19 


1312 


gi18605758


Mus musculus


°Tn0409G1 \ H\\c nrofpin 


1 'XA'X 


QS 

yo 


1312 


gi6526769


Homo sapiens


HRTHFR9 flftt 
1 lrvii ll DZfWj 


1 

1UDJ 


y 1 


1312 


gi7291408


Drosophila
melanogaster


CG11906-PA


SO 9
oZZ


jO j


1313 


gi19263985


Homo sapiens


hypothetical protein
MGC26766


IjDj


yy


1313 


gi19528309


Drosophila
melanogaster


LD02310p


571


1313 


gi7294955 


Drosophila
melanogaster


CG4080-PA


573


55


1314 


gi15030250


Mus musculus 


Ureb1-pending protein


5270 


95 


1314 


gi22090626 


Homo sapiens 


HECT domain protein LASU1


11690 


99 


1314 


gi6841194 


Homo sapiens 


AF161390_1 HSPC272


9665


QO 


1315 


gi13182757


Homo sapiens 


AF212238_1 HTPAP


781 


oy 


1315 


gi21542541


Homo sapiens 


Similar to HTPAP protein


1074 


91 


1315 


gi28381093 


Drosophila
melanogaster


CG12746-PD


421


^7


1316 


gi13182757


Homo sapiens 


AF212238_1 HTPAP


915 


100 


1316 


gi21542541


Homo sapiens 


Similar to HTPAP protein 


1204 


99 


1316 


gi28381093 


Drosophila
melanogaster


CG12746-PD


539


43


1317 


gi14424540


Homo sapiens 


AAH09293 


1146 


93 


1317 


gi15342051


Homo sapiens 


AAH13297


1146 


93 


1317 


gi30582231 


Homo sapiens 




1146 


93 


1319 


gi14715055


Homo sapiens 


MGC9564 protein 


487 


31 


1319 


gi16416764


Homo sapiens 


AF315594_1 FKSG16


2369 


99 


1319 


gi29436772 


Danio rerio 


Similar to DNA segment Chr
11, ERATO Doi 18, expressed


514


30


1320 


gi13905212


Mus musculus 


RIKEN cDNA 1200006F02 


257 


77 


1320 


gi16416764


Homo sapiens 


AF315594_1 FKSG16


323 


98 


1320 


gi31873637


Homo sapiens 


hypothetical protein


323 


98 


1321 


gi32330803 


Mus musculus 


podocan protein


2839 




1321 


gi32330805 


Homo sapiens 


podocan protein


3143 


99


1321 


gi33636569 


Drosophila
melanogaster


RE27764p


jy §


97


1322 


gi20258604


Homo sapiens


sialic acid binding Ig-like
lectin 5


1470


84


1322 


gi20988662


Homo sapiens














lectin 5 






1322 


gi9454520 


Homo sapiens 


AC018755_5 SIGLEC5


1470 


84 


1323 


gi20258604 


Homo sapiens 


sialic acid binding Ig-like
lectin 5


1372


87


1323 


gi20988662 


Homo sapiens 


sialic acid binding Ig-like
lectin 5


1372


87


1323 


gi9454520 


Homo sapiens 


AC018755_5 SIGLEC5


1372 


87 


1324 


gi13183078


Homo sapiens 


AF237652_1 a disintegrin-like
and metalloprotease domain
with thrombospondin type I
motifs-like 3


602


74


1324 


gi15099921


Homo sapiens 


AF176313_1 ADAM-TS 
related protein 1 


874 


98


1324 


gi20987759 


Homo sapiens 


Similar to ADAMTS-like 1 


886 


99 


1325 


gi178836


Homo sapiens 


apolipoprotein C-II 


424 


89 


1325 


gi30582255 


Homo sapiens 


apolipoprotein C-II 


418 


88 


1325 


gi757915 


Homo sapiens 


apoCII protein 


424 


89 


1326 


gi178836


Homo sapiens 


apolipoprotein C-II 


424 


89 


1326 


gi30584853 


synthetic construct 


Homo sapiens apolipoprotein
C-II 


422 


88


1326 


gi757915 


Homo sapiens 


apoCII protein 


424 


89 


1327 


gi15779162


Homo sapiens


AAH14644


477


100


1327 


gi21619424 


Homo sapiens 


Similar to LOC150580 


477 


100 


1328 


gi14715231


Homo sapiens 


DMBT1/8kb.2 protein


1486 


40 


1328





Oryctolagus
cuniculus


hensin






1328 


gi6624922 


Homo sapiens 


DMBT1/8kb.1 protein


1494 


41 






Homo sapiens


AEM1AQ07 1 QUO rlr\rnQtn 

Ar i ov uz_ i o rtz oomain- 
containing phosphatase anchor 

nmfpiTi OH 


GQ1 

yy l 




1329 


gi16033597


Homo sapiens 


AF416904_1 SH2 domain-
protein 2d 


1003 


99 


1329 


gi20810036


Homo sapiens 


Fc receptor-like protein 3 


985 


99 


1330


m 98974490 


Homo sapiens


lipoma HMGIC fusion-partner-
like protein


I loo 




1330 


gi30102428


Rattus norvegicus


HMGIC fusion-partner-like
protein


1147


yo 


1330 


m 3041 1045 


Mus musculus


Similar to lipoma HMGIC
fusion partner


1141


QA 
y*r 


1331 


ail 2060826 


Homo sapiens


defined breast cancer antigen 
NY-BR-70 


ov / 


11 


1331 


gi17426418


Mus musculus


calmodulin related protein


788


100


1331 


gi19484098


Mus musculus 


calmodulin-like 4 


783 


99 


1332 


gi10726831


Drosophila
melanogaster


CG9986-PA 


141 


25 


1332


gi16741164


Mus musculus 


DNA segment, Chr 6, Wayne 
State University 163, expressed 


938 


100 




ail 

gl I /OO^nJO 


Drosophila
melanogaster




141


25


1333


ail 160/* 044 


Homo sapiens


WNT6 precursor


2000


100


1333 


gi13279251


Homo sapiens 


AAH04329 Similar to
wingless-related MMTV
integration site 6


2000 


100 


1333 


gi30583751 


Homo sapiens 


wingless-type MMTV
integration site family, member 
6 


2000 


100 


1334 


gi19744304


Homo sapiens 


AF461760_1 zinc transporter 5


463 


94 


1334 


gi20135611 


Homo sapiens 


zinc transporter ZnT-5 


463 


94 


1334 


gi23270961 


Mus musculus 


Similar to zinc transporter 
ZTL1 


405 


85 


1335 


gi18480366


Mus musculus 


olfactory receptor MOR145-1 


310 


74 


1335 


gi21928214 


Homo sapiens 


seven transmembrane helix
receptor


301


77








fri37063^18 


Mus musculus


olfactory receptor

GA_x6K02T2PVTD- 

140^4886-140^9^7 


310


74 


1336 


gi12654633


Homo sapiens 


Protein inhibitor of activated
STAT3


3277 


100 




<ri90QRRRSfi 


Homo sapiens


protein inhibitor of activated
STAT3 


1977 


100 




gUVJOL7l 1 


Homo sapiens


protein inhibitor of activated
STAT3 


1977


100 


1337 


ai?744Q07S 


Wl CUk/ltl Ulii lo 

m n ^ ram hi a i ^ 


olcaluyi'V^UrV UCMtUl aoC 


1176


71 


1337 


gi30350098 


Homo sapiens 


AF389338_1 acyl-CoA-
desaturase


1769 


99 


1337 


gi4469173


Gallus gallus




1149 


71 


1338 


gi14030861


Homo sapiens 


paraneoplastic neuronal
antigen MA1


1830 


99 


1338 






A 90^ OR 1 naranpnnlaotfr 
antijypn* M*A1 


1834 


100 


1338 


gi24658774 


Homo sapiens 


paraneoplastic antigen MA1


1834 


100 


1339 


gi29468118 


Homo sapiens 


AF357888_1 PAP-2-like
protein 2 


1695 


100 


1339


glj 1JOV/JJJ 


Homo sapiens


plasticity related gene 2


1695


100


1339 


gi32186953


Homo sapiens 


lipid phosphate phosphatase- 
related protein type 3 


1695 


100 


1340


mill ^760^ 


Homo sapiens




1Q11 


100 


1340 


gi20809333 


Homo sapiens 


actin like protein 


1928 


99 


1340 


gi684936 


Homo sapiens 


peptide with resemblance to 
the actin family; the actual start 
of the coding region has not 
been determined


1362 


88 


1341


oil 1 177^10 
gl 1 1 I / / D L\J 


Rattus norvegicus


AE9R71fin 1 tonrlpm nnro 

Ar^o/juu^i idnaem pore 
domain potassium channel 
THIK-2 






1341 


gill 1775 14 


Homo sapiens 


AF287302_1 tandem pore
domain potassium channel
THIK-2


2234 


100 


1341 


gi28839529


Homo sapiens


Potassium channel subfamily
K, member 12 


2234 


100 


1342 


gi14198194


Mus musculus 


CDNA sequence BC008155 


606 


77 


1342 


gi14336716


Homo sapiens


AE006464_16 similar to
FBgn0003337


1216 


100 


1342 


gi7300722 


Drosophila
melanogaster


CG3337-PA 


326 


40


1343 


gi11862939


Mus musculus 


DDM36 


1117 


43 




mil S£9Q41 


Mus musculus 


UJJiViJOIl 


1 L\JJ 




1343 


gi19570398


Homo sapiens


hDDM36


1120


*tj 


1344 


gi21744725


Homo sapiens 


AF478693_1 glycosyl-
phosphatidyl-inositol-MAM 


4898 


98 


1344 


gi25005318 


Sus scrofa 


MAM domain containing 
glycosylphosphatidylinositol 
anchor 1 


4355 


95 


1344 


gi25005320 


Sus scrofa 


glycosylphosphatidylinositol 
anchor 1 protein 


4224 


94 


1345 


gi12276198


Homo sapiens 


AF333487_1 FKSG40


1020 


100 
1345


gi12408250


Homo sapiens 


FKSG28 


1020 


100 


1345 


gi18652934


Xenopus laevis 


Mig30 


634 


49 


1346 


gi21410151 


Mus musculus 


LOC213895 protein 


1657 


73 


1346 


gi27696627 


Homo sapiens 


Ribosome biogenesis protein 
BMS1 homolog 


4190 


99 


1346 


gi7294027 


Drosophila 
melanogaster 


CG7728-PA 


1345 


43 


1347 


gi12842044


Mus musculus 


unnamed protein product 


554 


71 


1347 


gi18921437


Mus musculus 


2010004A03Rik protein


850 


70 


1347 


gi20987450 


Homo sapiens 


LOC146433 


1160 


95 


1348 


gi1016012


Rattus norvegicus 


neural cell adhesion protein 
BIG-2 precursor 


5147 


92 


1348 


gi26891535 


Homo sapiens 


contactin 4 


5366 


98 


1348 


gi29837411 


Homo sapiens 


BIG-2 


5366 


98 


1349 


gi30102449


Homo sapiens 


lipoma HMGIC fusion-partner- 
like protein 


1161 


97 


1349 


gi30908798 


Homo sapiens 


lipoma HMGIC fusion partner- 
like protein 4 


952 


80 


1349 


gi30908800 


Rattus norvegicus 


lipoma HMGIC fusion partner- 
like protein 4 


951 


80 


1350 


gi13097705


Homo sapiens 


AAH03559, member 3 


2028 


95 


1350 


gi1340142


Homo sapiens 


alpha 1-antichymotrypsin


2024 


95 


1350 


gi21961493 


Homo sapiens 


, member 3 


2025 


95 


1351 


gi1850850


Murid herpesvirus 4 


serine threonine rich 
glycoprotein 


166 


30 


1351 


gi21618556


Homo sapiens 




3529 


91 


1351 


gi33304372 


Homo sapiens 


tastin 


3524 


91 


1352 


gi12053849


Homo sapiens 


DREV protein 


1689 


100 


1352 


gi12053851


Homo sapiens 


DREV1 protein


1673 


99


1352 


gi12053853


Homo sapiens 


DREV protein 


1689 


100 


1353 


gi14627081


Homo sapiens 


AF367017_1 caspase-1 
dominant-negative inhibitor 
Pseudo-ICE 


492 


100 


1353 


gi21707335


Homo sapiens 


Similar to CARD only protein 


462 


100 


1353 


gi33793 


Homo sapiens 


interleukin-1B converting
enzyme 


445 


92 


1355 


gi22760096 


Homo sapiens 


unnamed protein product 


1051 


93 


1355 


gi27883913 


Homo sapiens 


POTE 


497 


48 


1355 


gi28279813 


Homo sapiens 


Similar to hypothetical protein 
DKFZp434A171 


860 


99 


1356 


gi11125348


Homo sapiens 


putative protein kinase 


11920 


99 


1356 


gi6933864 


Homo sapiens 


kinase deficient protein KDP 


3408 


100 


1356 


gi8272557 


Rattus norvegicus 


AF227741_1 protein kinase
WNK1


5436 


73 


1357 


gi11125348


Homo sapiens 


putative protein kinase 


9671 


99 


1357 


gi20987908 


Mus musculus 


LOC269796 protein 


1553 


82 


1357 


gi8272557 


Rattus norvegicus 


AF227741_1 protein kinase
WNK1


5436 


73 


1358 


gi10946203


Homo sapiens 


AF272363_1 neuromedin U 
receptor 2 


785 


100 


1358 


gi16877377


Homo sapiens 


AAH16938 neuromedin U 
receptor 2 


785 


100 


1358 


gi9944990 


Homo sapiens 


AF292402_1 neuromedin U 
receptor-type 2 


785 


100 
1359


gi15020809


Takifugu rubripes 


putative methionyl tRNA 
synthetase 


1823 


64 


1359 


gi17861592


Drosophila 
melanogaster 


GH13807p 


1212 


45 


1359 


gi23171238


Drosophila 
melanogaster 


CG31322-PA 


1212 


45 


1360 


gi15341975


Homo sapiens 


AAH13184 Similar to major 
histocompatibility complex, 
class II, DP beta 1 


437 


72 


1360 


gi17389919


Homo sapiens 


AAH17967 Similar to major 
histocompatibility complex, 
class II, DP beta 1 


819 


100 


1360 


gi188479


Homo sapiens 


HLA-DPB1 


437 


72 


1361 


gi19701013


Homo sapiens 


unnamed protein product 


1143 


99 


1361 


gi3342737 


Homo sapiens 


R26660_2, partial CDS


1024 


100 


1361 


gi3478640 


Homo sapiens 


R26660_2, partial CDS


154 


100 


1362 


gi15779083


Homo sapiens 


AAH14609 


1172 


99 


1362 


gi3342737 


Homo sapiens 


R26660_2, partial CDS 


1002 


96 


1362 


gi3478640 


Homo sapiens 


R26660_2, partial CDS 


154 


100 


1363 


gi13991167


Homo sapiens 


sialic acid-binding 
immunoglobulin-like lectin-like 
long splice variant 


2879 


99 


1363 


gi14625822


Homo sapiens 


AF282256_1 Siglec-L1


2879 


99 


1363 


gi23272769 


Homo sapiens 


SIGLEC-like 1 


2879 


99 


1364 


gi15132186


Homo sapiens 


unnamed protein product 


1644 


100 


1364 


gi15132529


Homo sapiens 


unnamed protein product 


1644 


100 


1364 


gi21439502 


Homo sapiens 


unnamed protein product 


1644 


100 


1365 


gi19353230


Homo sapiens 


interleukin 1, delta 


823 


100 


1365 


gi6165336 


Homo sapiens 


interleukin-1-like protein 1


823 


100


1365 


gi9651789 


Homo sapiens 


AF230377_1 interleukin-1 
delta 


823 


100 


1366 


gi177870


Homo sapiens 


alpha-2-macroglobulin 
precursor 


2765 


40 


1366 


gi25303946 


Homo sapiens 


alpha-2-macroglobulin 


2765 


40 


1366 


gi579594 


Homo sapiens 


alpha 2-macroglobulin 690-740 


2760 


40 


1367 


gi25990364 


Homo sapiens 


AF319622_1 P-glycoprotein


555 


98 


1367


gi27656757 


Takifugu rubripes 


Mdr3 


311 


52 


1367 


gi4574224 


Fundulus heteroclitus 


AF099732_1 multidrug
resistance transporter homolog


287 


49 


1368 


gi12805221


Mus musculus 


Lymphocyte antigen 6 
complex, locus A 


713 


100 


1368 


gi198924


Mus musculus 


Ly-6A.2 


713 


100 


1368 


gi201U3 


Mus musculus 


T-cell activation protein 


713 


100 


1967 


gi13543526


Homo sapiens 


AAH05921 


616 


96 


1967 


gi18088830


Homo sapiens 


AAH20756 


616 


96 


1967 


gi30582691


Homo sapiens 




616 


96 


1968 


gi13543526


Homo sapiens 


AAH05921 


616 


96 


1968 


gi18088830


Homo sapiens 


AAH20756 


616 


96 


1968 


gi30582691 


Homo sapiens 




616 


96 


1969 


gi13543526


Homo sapiens 


AAH05921 


616 


96 


1969 


gi18088830


Homo sapiens 


AAH20756 


616 


96 


1969 


gi30582691 


Homo sapiens 




616 


96 


1970 


gi13543526


Homo sapiens 


AAH05921 


616 


96 


1970 


gi18088830


Homo sapiens 


AAH20756 


616 


96 
1970


gi30582691 


Homo sapiens 




616


96 


1971 


gi12653501


Homo sapiens


SERPINF1 protein


T1 in 
21 iy 


99 


1971 


gi15217079


Homo sapiens


AF400442_1 pigment
epithelium-derived factor


2125


99 


1971 


gi30583283 


Homo sapiens 


. member 1 


zi iy 


99 


1972 


gi20269957 


Sus scrofa 


AF498759_1 phospholipase C
delta 4


loo 


96 


1972 


gi21307610 


Mus musculus 


phospholipase C delta 4 


158 


90 


1972 


gi571466 


Rattus norvegicus 


phospholipase C delta-4


101 


84 


1973 


gi17864023


Homo sapiens


AF450090_1 KCCR13L


3299 


94 


1973 


gi22760385 


Homo sapiens 


unnamed protein product


3290 


94 


1973 


gi22761016 


Homo sapiens 


unnamed protein product


3299 


94 


1975 


gi19684107


Homo sapiens 




120 


92 


1975 


gi32966069 


Homo sapiens 


OD^QT 7 niiplpntiMacp 


120


92 


1975 


gi4691263


Homo sapiens 




120 


92 


1976 


gi11493483


Homo sapiens 




364 


71 


1976 


gi2580578 


Homo sapiens 


ubiquitous TPR motif Y
isoform 


339 


75 


1976 


gi8572229 


Homo sapiens 


ubiquitous TPR-motif protein 
Y isoform 


339 


75 


1977 


gi18848355


Mus musculus


Coq6 protein


2Uoj 


oo 


1977 


gi30047245 


Mus musculus 


Coq6 protein 


2090 


85 


1977 


gi4680659 


Homo sapiens 


x\i ij^tt i v>vji- proiem 


2378


98 


1978 


gi12654881


Homo sapiens 


AAH01284 


331 


78 


1978 


gi1710216


Homo sapiens


unknown


311 


73 


1978 


gi28799226 


Homo sapiens 


unnamed protein product 


252 


65 


1979 


gi11493483


Homo sapiens


A PI ^fi 1- 1*7 AQ DDm^cn 


143 


48 


1979 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c-
NTP


161 


63 


1979 


gi32486167 


Homo sapiens 


AD7c-NTP


161 


63 


1980 


gi20810589


Homo sapiens


similar to arsenite inducible
RNA associated protein


833 


99 


1980 


gi22945274 


Drosophila 
melanogaster 


CG12795-PA 


455 


54 


1980 


gi9651711 


Mus musculus


AF224494_1 arsenite inducible
RNA associated protein 


687 


80 


1981 


gi13241652


Rattus norvegicus


AF309558_1 supernatant
protein factor


162 


87 


1981 


gi13543184


Mus musculus 


SEC14-like 2 


162 


87 


1981 


gi6624130 


Rattus norvegicus 


AC004832_1 similar to 45 kDa
secretory protein 


169 


96 


1982 


gi11066250


Homo sapiens


AF197937_1 presenilins
associated rhomboid-like 
protein 


1392 


100 


1982 


gi13177766


Homo sapiens 


AAH03653 Similar to 
presenilins associated 
rhomboid-like protein 


1068 


80 


1982 


gi15559382


Homo sapiens


AAH14058 presenilins
associated rhomboid-like 
protein 


1389 


99 


1983 


gi1864091


Rattus norvegicus 


PSD-95/SAP90-associated 
protein-3 


160 


100 


1984 


gi11877274


Homo sapiens 




2265 


100 


1984 


gi21667210 


Homo sapiens 


AF465765_1
bactericidal/permeability-
increasing protein-like 1


2265


100


1984 


glZ 1 /uo / /o 


Homo sapiens 


Bactericidal/permeability- 
increasing protein-like 1 


2258 


99 




gus /yjH/ 


Caenorhabditis
elegans 




125 


36 




glZloU/ / / 1 


Homo sapiens 


organic anion transporter 2 


733 


100 


1986


glZl /U/4/4 


Homo sapiens 


, member 7 


733 


100 






Homo sapiens 


AF097518_1 liver-specific 
transporter 


733 


100 


1987


gllZoU41UD 


Homo sapiens 


AAH02905 Similar to 
CG15084 gene product 


589 


79 


1987 


gi13649459


Homo sapiens


AF250306_1 putative SB115
protein 


589 


79 


1987 


gi18204670


Mus musculus 


4930527D15Rik protein 


569 


75 


1988 


gi1022323


Mus musculus 


chain 


3354 


87 


1988


gi537329 


Homo sapiens 


alpha-2 type IV collagen


3752 


99 


1988 


gi556299 


Mus musculus 


alpha-2 type IV collagen 


3351 


87 


1989


gi17298315


Homo sapiens 


candidate tumor suppressor 
protein 


1360 


98 


1989 


gi7861733


Homo sapiens


AF176832_1 low density
lipoprotein receptor related 
protein-deleted in tumor 


1360 


98 


1989 


gi8926243 


Mus musculus 


AF270884_1 low density 
lipoprotein receptor related 
protein LRP1B/LRP-DIT 


1181 


84 


1990 


gi17298315


Homo sapiens 


candidate tumor suppressor 
protein 


1360 


98 


1990


gi7861733


Homo sapiens 


AF176832_1 low density 
lipoprotein receptor related 
protein-deleted in tumor 


1360 


98 


1990 


gi8926243 


Mus musculus 


AF270884_1 low density 
lipoprotein receptor related 
protein LRP1B/LRP-DIT 


1181 


84 


1991


gi11493483


Homo sapiens


AF130117_48 PRO2550


408 


78 


1991 


gi1872200


Homo sapiens 


alternatively spliced product 
using exon 13A 


328 


75 


1991 


gi7770139 


Homo sapiens 


AF119917_13 PRO1722


328 


72 




rrl 1 ^*7v4 Art 


Drosophila 
melanogaster 


fat protein 


370 


37 


1992


gi23093109


Drosophila 
melanogaster 


CG7749-PA 


367 


41 


1992


gi7295732 


Drosophila 
melanogaster 


CG3352-PA 


367 


38 


1993


1 CIA Art 

gllj /4U9 


Drosophila 
melanogaster 


fat protein 


370 


37 


1993 


gi23093109 


Drosophila 
melanogaster 


CG7749-PA 


367 


41 


1993 


gi7295732 


Drosophila 
melanogaster 


CG3352-PA 


367 


38 


1994 


gi27549552 


Homo sapiens 


dipeptidyl peptidase IV-related 
protein-2 


410 


89


1994 


gi29293087 


Homo sapiens 


dipeptidyl peptidase 9 


410 


89 


1994 


gi3513303


Homo sapiens 


R26984_1


476 


100 


1995 


gi32493172 


Homo sapiens 


pheromone receptor 


170 


96 


1995 


frtt9493 1 74 


Homo sapiens


pheromone receptor


1 7f\ 


yo 


1995 




Homo sapiens


pheromone receptor


1 /o 


i (\(\ 


1996 


gi23468368


Mus musculus


1 9000nR94Ri1r nrnf^in 


7QQ 

lyy 


/CI 


1996 


gi27695305 


Mus musculus 


1200013F24Rik protein 


825 


76 


1996


cri7SR99Q4 


Homo sapiens 


AT790flfi57 1 01 1 


781 


98 


1997


ml 690870 


Ciona intestinalis


myoplasmin-C1


190 


29 


1997 


gi31419817 


Mus musculus 


Golgi autoantigen, golgin 
subfamily a, 3 


124 


26 


1997 


gi4582571 


Gallus gallus 


Hyperion protein, 419 kD 
isoform 


125 


26 


1998


gi13872813



Homo sapiens 


fibulin-6 


1099


48


1998 


gi14575679


Homo sapiens 


AF156100_1 hemicentin


2159 


86 




m'187Q£<\Q 

gl-5o /y03o 


Caenorhabditis 
elegans 




636 


32 


1999


* 1 A(\AA ACO 

gll4U44UDZ 


Homo sapiens 


AAH07950 


1105 


51 


1999


gi i / jyi/oZj 


Mus musculus 


heterogenous nuclear 
ribonucleoprotein U 


1104 


51 


1999


glJOZZjjJ 



Gallus gallus 


nuclear calmodulin-binding 
protein 


1554


64 


2000 


gi17223626


Homo sapiens 


ATP-binding cassette A10 


1683 


93 


2000 


gi32350914 


Homo sapiens 


ATP-binding cassette sub- 
family A member 10 


1675 


92 


2000 


gi32350969 


Homo sapiens 


ATP-binding cassette sub- 
family A member 10 


1675 


92 


2001


gi i jj /HU/y 



Homo sapiens 


TAFII140 protein


3747 


99 


2001


m *1 *3 'l 7/J 17Q 
gl LJJ /'r I /o 


Mus musculus 


TAFII140 protein


3454


85


2001 


gi28175603


Homo sapiens 


TAF3 protein 


2775 


99 




ml 749 QO** » 

gi I /*tzyuoo 


Ralstonia 
solanacearum 


PROBABLE ACYL-COA
DEHYDROGENASE
OXIDOREDUCTASE
PROTEIN


676 


61 


2002


ai9977fi'^4 


Oceanobacillus
iheyensis HTE831


acyl-CoA dehydrogenase 


660 


63 


2002


cri9 89 5*009^ 
glZOZOVJUZJ 


Mus musculus


j / JiKoyti iukik protein 


9/4 


84


2003 


gi21522776


Homo sapiens 


unnamed protein product 


2998 


98 


2003


gi24047224


Homo sapiens 


similar to EGF-like-domain, 
multiple 6 


2982 


98 


2003 


gi6752658 


Homo sapiens 


AF186084_1 epidermal growth 
factor repeat containing protein 


2984 


98 


2004 


gi14530342


Caenorhabditis 
elegans 




389 


51 


2004


giojilool 


Caenorhabditis
elegans 


AF195610_1 LIN-41A


389 


51 


2004


glOJ.5 100 J 


Caenorhabditis 
elegans 


A 1 1 T TXT A 1 t> 


389 


51 


2005 


gi1504026


Homo sapiens 




5996 


00 


2005 


gi22725157 


Homo sapiens 


minor histocompatibility 
antigen HA-1 


5835 


99 


2005 


gi23272016 


Homo sapiens 


Similar to PTPL1-associated
RhoGAP1


5675 


98 


2006 


gi13274120


Homo sapiens 




995 


91 


2006 


gi6102996 


Mus musculus 


Vanin-3 


884 


78 


2006 


gi7160973


Homo sapiens 


VNN3 protein 


995 


91 


2007 


gi27463365 


Homo sapiens 


a disintegrin-like and 
metalloprotease with 


345 


93 
thrombospondin type 1 motifs
9B 






2007 


gi3876367 


Caenorhabditis 
elegans 




148 


39 


2007 


gi3879882 


Caenorhabditis 
elegans 




148 


39 


2008 


gi15963476


Homo sapiens 


AF289221_1 alpha-adaptin A 
related protein 


2085 


94 


2008 


gi15963477


Homo sapiens 


AF289221_2 alpha-adaptin A
related protein 


2118 


99 


2008 


gi4314340


A A 159-977 


Human alpha-adaptin A 
homolog 


2085 


94 


2009 


gi15488017


Homo sapiens 


AF407274_1 EWI2


3200 


100 


2009 


gi27497567 


Homo sapiens 


keratinocytes associated 
transmembrane protein 4 


3200 


100 


2009 


gi31753233


Homo sapiens 


Immunoglobulin superfamily, 
member 8 


3200 


100 


2010 


gi15488017


Homo sapiens 


AF407274_1 EWI2


3200 


100 


2010 


gi27497567 


Homo sapiens 


keratinocytes associated 
transmembrane protein 4 


3200 


100 


2010 


gi31753233


Homo sapiens 


Immunoglobulin superfamily, 
member 8 


3200 


100 


2011 


gi15488017


Homo sapiens 


AF407274_1 EWI2


3200 


100 


2011 


gi27497567 


Homo sapiens 


keratinocytes associated 
transmembrane protein 4 


3200 


100 


2011


gi31753233


Homo sapiens 


Immunoglobulin superfamily, 
member 8 


3200 


100 


2012


gi15488017


Homo sapiens


AF407274_1 EWI2


3200


100


2012 


gi27497567 


Homo sapiens 


keratinocytes associated 
transmembrane protein 4 


3200 


100 


2012


gi31753233


Homo sapiens


Immunoglobulin superfamily,
member 8


3200


100


2013 


gi1405723


Homo sapiens 


type X collagen 


198 


30 


2013 


gi30095 


Homo sapiens 


3 


198 


30 


2013


gl/5 /J5 32 


Homo sapiens 




198 


30 


2014 


gi15145793


Sus scrofa 


basic proline-rich protein 


233 


26 


2014


gi15145795


Sus scrofa 


basic proline-rich protein


205 


26 


2014 


gi25056007 


Zea mays 


AF159297_1 extensin-like
protein 


203 


26 


2015 


gi21992 


Volvox carteri 


extensin 


158 


37 


2015 


gi2429362 


Santalum album 


proline rich protein 


166 


39 


2015 


gi32488576 


Oryza sativa 
(japonica cultivar- 
group) 


OSJNBa0067K08.27 


157 


35 


2016 


gi12002042


Homo sapiens 


AF063606_1 brain my048
protein 


659 


70 


2016 


gi17225331


Homo sapiens 


AF325115_1 MY0876G05
protein 


659 


70 


2016 


gi17646146


Homo sapiens 


AF314542_1 B lymphocyte
activation-related protein 


727 


56 


2018 


gi13161063


Homo sapiens 


AF332218_1 protocadherin 11


746 


56 


2018 


gi13161066


Homo sapiens 


AF332219_1 protocadherin 11


746 


56 


2018 


gi9845485 


Homo sapiens 


AF169692_1 protocadherin-9 


1349 


100 


2019 


gi16552038


Homo sapiens 


unnamed protein product 


2139 


99 


2019 


gi21410124 


Mus musculus 


3230402E02Rik protein


1334 


60 
2019


gi5688958 


Homo sapiens 


PMMLP 


2140 


100 


2020 


gi21734445 


Rattus norvegicus 


BMP/Retinoic acid-inducible 
neural-specific protein-2


3958 


95 


2020 


gi21734447 


Rattus norvegicus 


BMP/Retinoic acid-inducible 
neural-specific protein-3 


2948 


70 


2020 


gi30348610 


Gallus gallus 


BMP/retinoic acid-inducible 
neural-specific protein 


2090 


52 


2021 


gi23272677 


Homo sapiens 


Similar to zinc finger protein 
208 


467 


80 


2021 


gi26251755 


Homo sapiens 


ZNF431 protein 


449 


78 


2021 


gi30421228 


Homo sapiens 


zinc finger protein 430 


572 


100 


2022 


gi23272677 


Homo sapiens 


Similar to zinc finger protein 
208 


467 


80 


2022 


gi26251755 


Homo sapiens 


ZNF431 protein 


449 


78 


2022 


gi30421228 


Homo sapiens 


zinc finger protein 430 


572 


100 


2023 


gi1212965


Homo sapiens 


transmembrane protein 


358 


70 


2023 


gi1213221


Rattus norvegicus 


transmembrane protein 


354 


69 


2023 


gi19683999


Homo sapiens 


coated vesicle membrane 
protein 


358 


70 


2024 


gi1199524


Homo sapiens 


acid phosphatase 


2246 


99 


2024 


gi13111975


Homo sapiens 


AAH03160 acid phosphatase 
2, lysosomal 


2242 


99 


2024 


gi30584617 


synthetic construct 


Homo sapiens acid 
phosphatase 2, lysosomal 


2242 


99 


2025 


gi15625570


Homo sapiens 


AF411981_1 centaurin beta5


353 


100 


2025 


gi30109272


Homo sapiens 


CENTB5 protein 


505 


99 


2025 


gi4688902 


Homo sapiens 


centaurin beta2 


270 


48 


2026 


gi27693942 


Homo sapiens 


Similar to expressed sequence 
AI449432 


1083 


42 


2026 


gi2789430 


Homo sapiens 


repressor protein 


1084 


42 


2026 


gi5630080 


Homo sapiens 


AC004890_2


1077 


42 


2027 


gi11345382


Homo sapiens 


AF308801_1 vacuolar protein
sorting protein 16 


2977 


99 


2027 


gi12140290


Homo sapiens 




2983 


99 


2027 


gi15553046


Mus musculus 


Vps16


2932 


97 


2028 


gi30141048 


Homo sapiens 


Nogo-66 receptor homolog-1 


294 


100 


2028 


gi30141052 


Rattus norvegicus 


Nogo-66 receptor homolog-1 


270 


92 


2028 


gi32351287 


Rattus norvegicus 


Nogo-66 receptor homolog 2 


149 


53 


2029 


gi202592 


Rattus norvegicus 


prealpha-2-macroglobulin 


238 


40 


2029 


gi671864 


Gallus gallus 


ovomacroglobulin, ovostatin 


230 


40 


2029 


gi671865 


Gallus gallus 


ovomacroglobulin, ovostatin 


230 


40 


2030 


gi15778556


Homo sapiens 


AF414429_1 alpha-1-B
glycoprotein precursor 


131 


92 


2031 


gi200057 


Mus musculus 


neuronal glycoprotein 


698 


94 


2031


gi29837411 


Homo sapiens 


BIG-2 


554 


75 


2031 


gi563133 


Rattus norvegicus 


BIG-1 protein 


692 


94 


2032 


gi16550078


Homo sapiens 


unnamed protein product 


763 


100 


2032 


gi28175743


Homo sapiens 


similar to hypothetical protein 
FLJ30803 


763 


100 


2032 


gi30354720 


Mus musculus 


AI427653 protein 


756 


100 


2033 


gi16550078


Homo sapiens 


unnamed protein product 


763 


100 


2033 


gi28175743


Homo sapiens 


similar to hypothetical protein 
FLJ30803


763 


100 


2033 


gi30354720


Mus musculus


AI427653 protein 


756 


100 


2034


mO 1 QOOAOO 


Homo sapiens 


seven transmembrane helix 


1711 


88 








receptor 






2034


mOAOR^HOQ 
glZH-ZODl/Zl/ 



Homo sapiens 


G-protein coupled receptor 


6754 


97 








r^v%T> 11/1 






2034


glJjZjU /o 



Rattus norvegicus 


seven transmembrane receptor 


5038 


72 


2035


gii iyi /ou/ 


Homo sapiens 


rirrl protein 


434


59 


2035 


gi13938351


Homo sapiens 


AAH07307 Similar to zinc 


432 


63 








finger protein 268 






2035 


gi3135968


Homo sapiens 




440 


58 


2036 


gi13097633


Homo sapiens 


AAH03534 Similar to ATPase, 


373 


84 








Class I, type 8B, member 1 






2036 


gi33440008


Homo sapiens 


possible aminophospholipid 


406 


91 








translocase ATP8B2 






2036


gi3628757 



Homo sapiens 


FIC1 


373 


84 


2038 


gi11558486


Homo sapiens 


B-cell lymphoma/leukaemia 


1314 


99 








11A short form






2038 


gi12150278


Homo sapiens 


AF080216_1 C2H2-type zinc-


1197


98








finger protein; EVI-9






2038 


gi30410854 


Mus musculus 




1312 


98 


2039 


gi32394378 


Homo sapiens 


forkhead-associated domain 


1735 


94 








histidine-triad like protein 






2039 


gi32394380 


Bos taurus 


forkhead-associated domain 


1540 


83 








histidine-triad like protein 






2039 


gi32394382 


Sus scrofa 


forkhead-associated domain 


1575


84 








histidine-triad like protein 






2040


gi32394378 


Homo sapiens 


forkhead-associated domain 


1735 


94 








histidine-triad like protein 






2040 


gi32394380 


Bos taurus 


forkhead-associated domain 


1540 


83 








histidine-triad like protein






2040


gi32394382




Sus scrofa 


forkhead-associated domain 


1575 


84 








histidine-triad like protein 






2041


gi32394378



Homo sapiens 


forkhead-associated domain 


1735 


94 








histidine-triad like protein 






2041


gi32394380


Bos taurus 


forkhead-associated domain 


1540 


83 








histidine-triad like protein 






2041


gi32394382


Sus scrofa 


forkhead-associated domain 


1575 


84 








histidine-triad like protein 






2042 


gl2o454883 


Homo sapiens 


hypothetical protein HSPC148 


1181 


100


2042


gi6523797 


Homo sapiens 


AF110775_1 adrenal gland


1181 


100 








protein AU-002 






2042 


gi6841518 


Homo sapiens 


AF161497_1 HSPC148


1178 


99 


2043 


gi14009597


Homo sapiens 


AF282619_1 lysyl oxidase-like


1569 


98 








3 protein 






2043


rri 1 /MQ«AA 



Homo sapiens 


AF311313_1 lysyl oxidase-like


1569 


98








3 protein 






2043 


gi15186770


Homo sapiens


AF284815_1 lysyl oxidase-like




98 








protein 






2044 


gi10834722


Homo sapiens 


AF258588_1 PP5656


892 


89 


2044 


gi21706836 


Mus musculus 


Gyltl1b protein


1056 


87 


2044 


gi22713410 


Homo sapiens 


GYLTL1B protein 


1205 


100 


2045 


gi7209721 


Mus musculus 


DD57 


2242 


88 


2045 


gi7209723 


Homo sapiens 


WD-repeat like sequence 


2483 


100 


2045 


gi8217485


Homo sapiens 




2480 


99 


2046 


gi13592175


Leishmania major 


AC084329_1 ppg3


140 


28 


2046 


gi28828184 


Dictyostelium 


similar to Leishmania major. 


179 


28 






discoideum 


Ppg3 






2046 


gi3873550 


Schizosaccharomyce 


SPBC215.13 


147 


24 






s pombe 








2047 


gi21104460


Homo sapiens 


OK/SW-CL.19


206 


100 


2047 


gi32425794 


Homo sapiens 


NJMU-R1 protein


206 


100 


2047 


gi32450708 


Homo sapiens 


NJMU-R1 protein


206 


100 


2048 


gi13277972


Mus musculus 


phosphatidate 


2270 


95








cytidylyltransferase 2 






2048 


gi19344052


Homo sapiens 




2360


99 


2048 


gi4186023


Homo sapiens 


CDS2 protein 


2360 


99 


2049 


gi17862928


Drosophila 


SD03549p 


121 


35 






melanogaster 








2049 


gi29387317 


Mus musculus 


l2000HO22Rik protein 


670 


89 


2049 


gi7297878 


Drosophila 


CG14941-PA


121 


35 






melanogaster 








2050 


gi13562004


Nephila 


AF350276_1 major ampullate 


251 


33 






madagascariensis 


spidroin 2-like protein 






2050 


gi7106224


Nephila clavipes 


flagelliform silk protein 


252 


32 


2050 


gi7106228 


Nephila inaurata 


flagelliform silk protein 


277 


34 






madagascariensis 


[Nephila madagascariensis] 






2051 


gi12018147


Chlamydomonas 


AF309494_1 vegetative cell 


198 


31 






reinhardtii 


wall protein gp1






2051 


gi15145793


Sus scrofa 


basic proline-rich protein 


204 


29 


2051 


gi15145797


Sus scrofa 


basic proline-rich protein 


200 


30 


2052 


gi16877193


Homo sapiens 


AAH16860 G protein-coupled 


2320 


99 








receptor, family C, group 5, 












member C 






2052 


gi30583709 


Homo sapiens 


G protein-coupled receptor, 


2320 


99 








family C, group 5, member C 






2052 


gi8118032


Homo sapiens 


AF207989_1 orphan G-protein 


2320 


99 








coupled receptor 






2053 


gi15679980


Homo sapiens 


CI 14 protein 


930 


99 


2053 


gi16769562


Drosophila 


LD38910p 


328 


47 






melanogaster 








2053 


gi7302978 


Drosophila 


CG8441-PA 


328 


47 






melanogaster 








2054 


gi10726751


Drosophila 


CG13623-PA 


333 


53 






melanogaster 








2054 


gi21430012 


Drosophila 


GH27470p 


333 


53 






melanogaster 








2054 


gi7406400 


Arabidopsis thaliana 


putative protein 


317 


45 


2055 


gi13959018


Homo sapiens 


AF361746_1 endothelial cell- 


1578 


99 








selective adhesion molecule 






2055 


gi13991773


Mus musculus 


AF361882_1 endothelial cell- 


1188 


76 








selective adhesion molecule 






2055 


gi29165726


Mus musculus 


Endothelial cell-selective 


1188 


76 








adhesion molecule 






2056 


gi15422171


Homo sapiens 


22 kDa peroxisomal membrane 


862 


99 








protein 2 






2056 


gi297437 


Rattus norvegicus 


peroxisomal membrane protein 


680 


76 


2056 


gi8164184 


Homo sapiens 


22kDa peroxisomal membrane 


862 


99 








protein-like 






2057 


gi11994465


Arabidopsis thaliana 


contains similarity to late 


141 


39 








embryogenesis abundant 












protein~gene_id:MLD14.16
2057


gi21326031 


Oryzias latipes 


choriogenin H 


159 


35 


2057 


gi22093906 


Oryzias latipes 


AF396668_1 choriogenin H


157 


35 


2058 


gi62877 


Gallus gallus


type VI collagen alpha-2 
subunit preprotein 


320 


42 


2058 


gi62881


Gallus gallus


type VI collagen subunit 
alpha2 


320 


42 


2058 


gi62882 


Gallus gallus 


type VI collagen subunit 
alpha2 


320 


42 


2059 


gi17945608


Drosophila 
melanogaster 


RE26969p 


600 


60 


2059 


gi7292879 


Drosophila 
melanogaster 


CG1998-PA 


600 


60 


2059 


gi7292910 


Drosophila 
melanogaster 


CG11162-PA 


423 


50 


2060 


gi17066106


Homo sapiens 


Novex-3 Titin Isoform 


964 


99 


2060 


gi27696390 


Xenopus laevis 


Similar to titin 


251 


37 


2060 


gi992994 


Gallus gallus 


myosin light chain kinase 


228 


35 


2061 


gi14089982


Mycoplasma 
pulmonis 




143 


33 


2061 


gi2649941 


Archaeoglobus
fulgidus DSM4304 




151 


30 


2061 


gi30180922


Nitrosomonas 
europaea ATCC 
19718 


Adenylate kinase 


143 


28 


2062 


gi29477024 


Mus musculus 


Similar to RIKEN cDNA 
9130023G24 gene 


464 


44 


2062 


gi3002588 


Mus musculus 


Plenty of SH3s; POSH 


148 


25 


2062 


gi7453547 


Homo sapiens 


glioma tumor suppressor 
candidate region protein 1 


125 


25 


2063 


gi29477024 


Mus musculus 


Similar to RIKEN cDNA 
9130023G24 gene


464 


44 


2063 


gi3002588 


Mus musculus 


Plenty of SH3s; POSH 


148 


25 


2063 


gi7453547 


Homo sapiens 


glioma tumor suppressor 
candidate region protein 1 


125 


25 


2064 


gi10441350


Mus musculus 


olfactory UDP 
glucuronosyltransferase 


241 


70 


2064 


gi4580602 


Macaca fascicularis 


AF112112_1 UDP-
glucuronosyltransferase 2B19
precursor 


244 


73 


2064 


gi4753766 


Homo sapiens 


UDP glucuronosyltransferase 


266 


76 


2065 


gi13325266


Homo sapiens 


AAH04450 hypothetical 
protein MGC2650 


796 


91 


2065 


gi3688090 


Homo sapiens 


R32611_2


827 


100 


2065 


gi6841228 


Homo sapiens 


AF161407_1 HSPC289


703 


84 


2066 


gi11493483


Homo sapiens 


AF130117_48 PRO2550


282 


56 


2066 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c- 
NTP 


497 


62 


2066


gi32486167 


Homo sapiens 


AD7C-NTP 


497 


62 


2067 


gi16552274


Homo sapiens 


unnamed protein product 


276 


45 


2067 


gi57516 


Rattus rattus 


ASM 15 


437 


57 


2067 


gi7107346 


Peromyscus 
maniculatus bairdii 


H19 


280 


43 


2068 


gi20330550 


Homo sapiens 


AF251706_1 NK inhibitory
receptor precursor 


1480 


94 


2068 


gi30962591 


Homo sapiens 


AF375480_1 immune receptor


1401 


93 
expressed on myeloid cells
splice variant 1 






2068 


gi31790204


Homo sapiens 


inhibitory receptor IREM1 


1478 


94 


2069 


gi20330550 


Homo sapiens 


AF251706_1 NK inhibitory
receptor precursor 


1480 


94 


2069 


gi30962591 


Homo sapiens 


AF375480_1 immune receptor 
expressed on myeloid cells 
splice variant 1 


1401 


93 


2069 


gi31790204


Homo sapiens 


inhibitory receptor IREM1 


1478 


94 


2070 


gi20330550 


Homo sapiens 


AF251706_1 NK inhibitory
receptor precursor 


1480 


94 


2070 


gi30962591 


Homo sapiens 


AF375480_1 immune receptor 
expressed on myeloid cells 
splice variant 1 


1401 


93 


2070 


gi31790204


Homo sapiens 


inhibitory receptor IREM1 


1478 


94 


2071 


gi18307481


Homo sapiens 


phosphoinositide-binding 
proteins 


2206 


97 


2071 


gi27695704 


Mus musculus 


Connector enhancer of KSR2 


705 


35 


2071 


gi29691916 


Rattus norvegicus 


interactor protein for cytohesin 
exchange factors 1 


1651 


79 


2072 


gi11493982


Homo sapiens 


AF208232_1 TLH29 protein
precursor 


303 


70 


2072


gi15929988


Homo sapiens 


AAH15423 Similar to TLH29
protein precursor 


497 


100 


2072 


gi21618549


Homo sapiens 


TLH29 protein precursor 


303 


70 


2073 


gi11493982


Homo sapiens 


AF208232_1 TLH29 protein 
precursor 


303 


70 


2073 


gi15929988


Homo sapiens 


AAH15423 Similar to TLH29 
protein precursor 


497 


100


2073


gi21618549


Homo sapiens


TLH29 protein precursor


303


70

2074


gllZSU4oy.5 


Homo sapiens 


A A IJA1 77^3 CimJIof *« 

AAiiU 1 1 is similar to 
ribosomal protein L34


KQ 1 

jy i 


inn 


2074


gil /yjzyjo 



Homo sapiens 


ribosomal protein L34 


jy i 


100


2074


glzUJU04J4 


Mus musculus 


1 1 nnnni roiBiLr nrntotn 
l luuuuiizzKiK protein 


JO / 


99


2075


gllJJ04o41 


Homo sapiens 


activating NK receptor


/jo 


99


2075


gil J J04o4j 


Homo sapiens


NTB-A receptor


7^4 


1 no 


2075


gizuyoouyy 


Mus musculus 


lymphocyte antigen 108


OACi 


^Q 

yy 


2076


gllUl / /OZl 


Arabidopsis thaliana 


phytoene dehydrogenase-like 


^7^ 


AO 
*fZ 


2076


m" 1 7Q7Q7^< 


Arabidopsis thaliana 


A 1 OgHiODU/JVOiVL 1 J_ IU 


Joy 


AO 1 
*tZ 


2076


m'7Gn707/H 

gl/VU/o /4Z 


Arabidopsis thaliana 


AID g4y J J U/ JVOM 1 3_ 1 \) 


^SQ 

JO? 


42


2077 


gil4270364 


Mus musculus 


Epigen protein 


378 


71 


2077 


gi6272269


Rattus norvegicus


NCI protein


122


52


2077


gi7799191


Mus musculus


tomoregulin-1


122


52


2078


gi14270364


Mus musculus


Epigen protein


378


71 


2078


gi6272269


Rattus norvegicus


NCI protein


122


52


2078 


gi7799191 


Mus musculus 


tomoregulin-1 


122 


52 


2079 


gi14270364


Mus musculus 


Epigen protein 


378 


71 


2079 


gi6272269 


Rattus norvegicus 


NCI protein 


122 


52 


2079 


gi7799191 


Mus musculus 


tomoregulin-1 


122 


52 


2080 


gi27469556 


Homo sapiens 


Putative neuronal cell adhesion 
molecule 


206 


34 


2080 


gi29289929 


Danio rerio 


neogenin 


176 


37 


2080 


gi3068592 


Mus musculus 


punc 


192 


35 


2081 


gi31753150


Homo sapiens 


Ras family member Ris 


665 


65 
Mus musculus


Co lo 


1 97A 
1Z /O 


04 


2081


rriTll 1 1 OO 

gl/Jj 1 lz/ 


Homo sapiens 


Ar/jjjoo l isls 


OOj 


Oj 


2082 


* 1 1 1 oono c 

giiiizoyzj 


Homo sapiens 


Arjv4j/o_i ULdrz protein 


1 110 
LJ iZ 


99


2082 




Homo sapiens 


retinoic acid early transcript 1


lO 1Z 


99


2082 


gi21961213 


Homo sapiens 


UL16 binding protein 2 


1 no 

1 J LZ 


99


2083 


gi13872813


Homo sapiens 


fibulin-6 




29


2083 


gi 14575679 


Homo sapiens 


AF156100 1 hemicentin 


513 


29 


2083 


gi9280405


Homo sapiens 


AF245505_1 adlican


1462


46


2084 


gi13872813


Homo sapiens 


fibulin-6 


513 


29 


2084 


gi14575679


Homo sapiens 


AF156100_1 hemicentin


513 


29


2084 


gi9280405 


Homo sapiens 


AF245505_1 adlican


1462 


46 


2085 


gi13872813


Homo sapiens 


fibulin-6 


513 


29 


2085 


gi14575679


Homo sapiens 


AF156100_1 hemicentin


513 


29 


2085 


gi9280405 


Homo sapiens 


AF245505_1 adlican


1462 


46 


2086 


gi3041867 


Homo sapiens 


p53 


162 


96 


2086 


gi4731632 


Homo sapiens 


AF135121_1 tumor suppressor
protein p53 


162 


96 


2086 


gi4732147 


Homo sapiens 


AF136271_1 tumor suppressor 
protein p53 


162 


96 


2087 


gi12240284


Mus musculus 


AF327059_1 apolipoprotein
A5 


1300 


72 


2087 


gi6707433 


Homo sapiens 


AF202889_1 apolipoprotein
A5 


1864 


100 


2087 


gi6707435 


Homo sapiens 


AF202890_1 apolipoprotein
A5 


1864 


100 


2088 


gi12240284


Mus musculus 


AF327059_1 apolipoprotein 
A5 


1300 


72 


2088 


gi6707433 


Homo sapiens 


AF202889_1 apolipoprotein
A5 


1864 


100 


2088 


gi6707435 


Homo sapiens 


AF202890_1 apolipoprotein
A5 


1864 


100


2089 


gi13111784


Homo sapiens 


AAH03081 hypothetical 
protein FLJ10637 


1509 


99 


2089 


gi13543037


Mus musculus 


4933424B01Rik protein 


958 


80


2089 


gi14249965


Homo sapiens 


AAH08368 hypothetical 
protein FLJ10637 


1513 


100 


2090 


gi19344001


Homo sapiens 


phospholipase A2, group IID 


846 


99 


2090 


gi5771420 


Homo sapiens 


AF112982_1 group IID 
secretory phospholipase A2 


852 


100 


2090 


gi6453793 


Homo sapiens 


AF188625_1 phospholipase 
A2 


846 


99 


2091 


gi1674069


Mycoplasma 
pneumoniae 


30K adhesin-related protein 


132 


35


2091 


gi1684932


Mycoplasma 
pneumoniae 


adhesin protein 


132 


35 


2091 


gi5114063 


Mycoplasma 
pneumoniae 


AF090172_1 revertant 
adhesin-related protein P30 


128 


35 


2092 


gi11094019


Homo sapiens 


AF305057_2 RTS beta


2047 


94 


2092 


gi1150421


Homo sapiens 


rTSbeta 


2053 


94 


2092 


gi12654883


Homo sapiens 


AAH01285 rTS beta protein 


2053 


94 


2094 


gi13432042


Homo sapiens 


integrin-linked kinase- 
associated serine/threonine 
phosphatase 2C 


2018 


100 


2094 


gi16306907


Homo sapiens 


AAH06576 integrin-linked 


2018 


100 
kinase-associated
serine/threonine phosphatase 
2C 






2094 


gi20072498 


Mus musculus 


0710007A14Rik protein


1935 


95 


2095 


gi18490682



Homo sapiens 


fibulin 1 


281 


37 


2095 


gi28175169 


Mus musculus 


1300015B04Rik protein 


589 


74 


2095 


gi31419 


Homo sapiens 


fibulin-1 C


281 


37 


2096 


gi18480746


Mus musculus 


olfactory receptor MOR261-10 


1336 


80 


2096 


gi21928655 


Homo sapiens 


seven transmembrane helix 
receptor 


1427 


90 


2096 


gi32052225 


Mus musculus 


olfactory receptor 
GA_x6K02T2P3E9-4341246-
4340281


1336 


80 


2097 


gi18480746


Mus musculus 


olfactory receptor MOR261-10 


1336 


80 


2097 


gi21928655 


Homo sapiens 


seven transmembrane helix 
receptor 


1427 


90 


2097 


gi32052225 


Mus musculus 


olfactory receptor 
GA_x6K02T2P3E9-4341246-
4340281


1336 


80 


2098 


gi4760780 


Mus musculus 


Ten-m3 


401 


95 


2098 


gi5307761 


Danio rerio 


ten-m3 


347 


80 


2098 


gi6760369 


Mus musculus 


AF195418_1 ODZ3


401 


95 


2099 


gi21205852


Homo sapiens 


AF385429_1 T-cell activation 
Rho GTPase activating protein; 
TA-GAP 


989 


100 


2099


gi21410139 


Mus musculus 


T-cell activation Rho GTPase- 
activating protein 


813 


82 


2099 


gi24980955 


Mus musculus 


T-cell activation Rho GTPase- 
activating protein 


813 


82 


2100 


gi1872200


Homo sapiens 


alternatively spliced product 
using exon 13A 


242 


58 


2100 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c- 
NTP 


283 


59 


2100 


gi32486167 


Homo sapiens 


AD7C-NTP


283 


59 


2101 


gi1872200


Homo sapiens 


alternatively spliced product 
using exon 13A 


242 


58 


2101 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c- 
NTP 


283 


59 


2101 


gi32486167 


Homo sapiens 


AD7C-NTP 


283 


59 


2102 


gi20196856 


Arabidopsis thaliana 


putative myosin heavy chain 


387 


47 


2102 


gi3142302


Arabidopsis thaliana 


Z34293 from A. thaliana. 


389 


47 


2102 


gi532124 


Dictyostelium 
discoideum 


myosin IC 



388 


46 


2103 


gi20196856


Arabidopsis thaliana 


putative myosin heavy chain 


387 


47 


2103 


gi3142302


Arabidopsis thaliana 


Z34293 from A. thaliana. 


389 


47 




gi532124


Dictyostelium
discoideum


myosin IC


388


46


2104 


gi29564894 


Homo sapiens 


unnamed protein product 


174 


39 


2104 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c- 
NTP 


174 


39 


2104 


gi32486167 


Homo sapiens 


AD7C-NTP 


174 


39 


2105 


gi21265163 


Homo sapiens 




1893 


95 


2105 


gi7248845 


Homo sapiens 


AF231124_1 testican-1


1893 


95 


2105 


gi793845 


Homo sapiens 


testican 


1893 


95 


2106 


gi12804465


Homo sapiens 


AAH01639 prostate cancer 


686 


66 
overexpressed gene 1






2106 


gi20380774 


Homo sapiens 




1098 


99 


2106 


gi3462515 


Homo sapiens 


PB39 


686 


66 


2107 


gi12804465


Homo sapiens 


AAH01639 prostate cancer 
overexpressed gene 1 


686 


66 


2107 


gi20380774 


Homo sapiens 




1098 


99


2107 


gi3462515 


Homo sapiens 


PB39 


686 


66 


2108 


gi17391348


Homo sapiens 


AAH18615 Similar to brain 
expressed, X-linked 1 


664 


100 


2108 


gi7689029 


Homo sapiens 


AF220189_1 uncharacterized 
hypothalamus protein HBEX2 


664 


100 


2108 


gi9963771 


Homo sapiens 


AF183416_1 ovarian granulosa
cell 13.0 kDa protein hGR74 
homolog 


664 


100 


2109 


gi26353296 


Mus musculus 


unnamed protein product 


711 


76 


2109 


gi28799187 


Homo sapiens 


unnamed protein product 


1463 


98 


2109 


gi30908853 


Homo sapiens 


synleurin 


1463 


98 


2111 


gi20988071 


Mus musculus 


2600011E07Rik protein


445 


89 


2111 


gi23274133 


Homo sapiens 


Similar to serine/arginine 
repetitive matrix 1 


161 


27 


2111 


gi3153821


Mus musculus 


plenty-of-prolines-101;
POP101; SH3-philo-protein 


164 


30 


2112 


gi9651079 


Macaca fascicularis 


hypothetical protein 


291 


75 


2113 


gi12408272


Homo sapiens 


apolipoprotein L-IV splice 
variant a 


1726 


99 


2113 


gi12408286


Homo sapiens 


apolipoprotein L-IV splice 
variant a 


1726 


99 


2113 


gi13374351


Homo sapiens 


AF305226_1 apolipoprotein
L4 


1709 


98 


2114 


gi12408272


Homo sapiens 


apolipoprotein L-IV splice 
variant a 


1726 


99 


2114 


gi12408286


Homo sapiens 


apolipoprotein L-IV splice 
variant a 


1726 


99 


2114 


gi13374351


Homo sapiens 


AF305226_1 apolipoprotein
L4 


1709 


98 


2115 


gi21744725


Homo sapiens 


AF478693_1 glycosyl-
phosphatidyl-inositol-MAM


717 


97 


2115 


gi25005318 


Sus scrofa 


MAM domain containing 
glycosylphosphatidylinositol 
anchor 1 


672 


91 


2115 


gi25005320 


Sus scrofa 


glycosylphosphatidylinositol 
anchor 1 protein 


672 


91 


2116 


gi21744725 


Homo sapiens 


AF478693_1 glycosyl-phosphatidyl-inositol-MAM


717 


97 


2116 


gi25005318


Sus scrofa 


MAM domain containing 
glycosylphosphatidylinositol 
anchor 1


672 


91 


2116 


gi25005320 


Sus scrofa 


glycosylphosphatidylinositol 
anchor 1 protein 


672 


91 


2117 


gi16769264


Drosophila
melanogaster


LD21615p 


219 


40 


2117 


gi7290426 


Drosophila 
melanogaster 


CG2875-PB 


219 


40 


2117 


gi7290427 


Drosophila 
melanogaster 


CG2875-PA 


219 


40


2118


gi23273399 


Homo sapiens 




963 


100 


2118 


gi25059032 


Mus musculus 




686 


72 


2118 


gi28385965 


Mus musculus 


Similar to phospholipase A2 


488 


77 


2119 


gi23273399 


Homo sapiens 




963 


100 


2119 


gi25059032 


Mus musculus 




686 


72 


2119 


gi28385965 


Mus musculus 


Similar to phospholipase A2 


488 


77 


2120 


gi13562004


Nephila
madagascariensis


AF350276_1 major ampullate 
spidroin 2-like protein 


228 


27 


2120 


gi13562008


Nephila
madagascariensis


AF350278_1 major ampullate 
spidroin 2 


238 


29 


2120 


gi159714


Nephila clavipes 


dragline silk fibroin 


224 


29 


2121 


gi13161409


Mus musculus 


family 4 cytochrome P450 


445 


76 


2121 


gi13182964


Mus musculus 


AF233643_1 cytochrome P450
CYP4F13 


191 


38 


2121 


gi13278244


Mus musculus 


cytochrome P450, family 4, 
subfamily f, polypeptide 13 


191 


38 


2122 


gi10944887


Homo sapiens 


FGFR-like protein 


1858 


97 


2122 


gi13183618


Homo sapiens 


AF312678_1 FGF homologous
factor receptor 


1807 


96 


2122 


gi13447749


Homo sapiens 


AF279689_1 fibroblast growth 
factor receptor 5 


1858 


97 


2123 


gi10944887


Homo sapiens 


FGFR-like protein 


1858 


97 


2123 


gi13183618


Homo sapiens 


AF312678_1 FGF homologous
factor receptor 


1807 


96 


2123 


gi13447749


Homo sapiens 


AF279689_1 fibroblast growth
factor receptor 5 


1858 


97 


2124 


gi10944887


Homo sapiens 


FGFR-like protein 


1858 


97 


2124 


gi13183618


Homo sapiens 


AF312678_1 FGF homologous
factor receptor 


1807 


96 


2124 


gi13447749


Homo sapiens 


AF279689_1 fibroblast growth
factor receptor 5 


1858 


97 


2125 


gi12667454


Rattus norvegicus 


AF336858_1 synaptotagmin
VIIc 


949 


88 


2125 


gi12667456


Rattus norvegicus 


AF336859_1 synaptotagmin
VIId


949 


88 


2125 


gi12667458


Rattus norvegicus 


AF336860_1 synaptotagmin
VIIe


949 


88 


2126 


gi12053709


Homo sapiens 


with thrombospondin type 1 
motif, 12 


1143 


98 


2126 


gi27817773


Mus musculus 


metalloprotease disintegrin 12 
protein 


873 


76 


2126 


gi5923788 


Homo sapiens 


AF140675_1 zinc
metalloprotease ADAMTS7 


271 


39 


2127 


gi11493982


Homo sapiens 


AF208232_1 TLH29 protein 
precursor 


303 


70 


2127 


gi15929988


Homo sapiens 


AAH15423 Similar to TLH29 
protein precursor 


497 


100 


2127 


gi21618549 


Homo sapiens 


TLH29 protein precursor 


303 


70 


2128 


gi17391206


Mus musculus 


RIKEN cDNA 2210412D01 


1267 


99 


2128 


gi23468210 


Homo sapiens 


Similar to CGI-67 protein 


1096 


81 


2128 


gi9368522 


Homo sapiens 


CGI-67 protein 


1267 


99 


2129 


gi17391206


Mus musculus 


RIKEN cDNA 2210412D01 


1267 


99 


2129 


gi23468210 


Homo sapiens 


Similar to CGI-67 protein 


1096 


81 


2129 


gi9368522 


Homo sapiens 


CGI-67 protein 


1267 


99


2130


gi20071312 


Mus musculus 


4933425F03Rik protein


614 


85 


2130 


gi33391740 


Homo sapiens 


MGC45780 


426 


96 


2130 


gi735 


Bos taurus 


scavenger receptor type I 


336 


51 


2131 


gi20071312 


Mus musculus 


4933425F03Rik protein 


614 


85 


2131 


gi33391740 


Homo sapiens 


MGC45780 


426 


96 


2131 


gi735 


Bos taurus 


scavenger receptor type I 


336 


51 


2132 


gi5870866 


Homo sapiens 


TATA element modulatory 
factor 


4531 


99 


2132 


gi6650548 


Rattus norvegicus 


AF107843_1 TATA element 
modulatory factor 


2583 


82 


2132 


gi7290766 


Drosophila 
melanogaster 


CG4557-PA 


692 


25 


2133 


gi1020145


Homo sapiens 


DNA binding protein 


1483 


43


2133 


gi18643896


Homo sapiens 


zinc finger protein 


1486


43 


2133 


gi29476835 


Homo sapiens 




1486


43 


2134 


gi16198520


Homo sapiens 


Saccharomyces cerevisiae 
Nip7p homolog 


944 


100 


2134 


gi4680713 


Homo sapiens 


AF132971_1 CGI-37 protein


944 


100 


2134 


gi5114055


Homo sapiens 


HSPC031 


944


100


2135 


gi23274241 


Homo sapiens 


KIAA1892-like


563 


86 


2135 


gi26332114 


Mus musculus 


unnamed protein product 


577 


89 


2135 


gi26345386 


Mus musculus 


unnamed protein product 


577 


89 


2136 


gi15620885


Homo sapiens 


KIAA1913 protein 


1627 


99 


2136 


gi26339494 


Mus musculus 


unnamed protein product 


1480 


90 


2136 


gi28279830 


Homo sapiens 


KIAA1913 protein 


1598 


99 


2137 


gi1000448


Rattus norvegicus 


Rat kidney AGT2 precursor 


1578


84 


2137 


gi12406973


Homo sapiens 


alanine-glyoxylate
aminotransferase 2 


1865


98 


2137 


gi1944136


Rattus norvegicus 


beta-alanine-pyruvate 
aminotransferase 


1625 


85 


2138 


gi1000448


Rattus norvegicus 


Rat kidney AGT2 precursor 


1578 


84 


2138 


gi12406973


Homo sapiens 


alanine-glyoxylate 
aminotransferase 2 


1865 


98 


2138 


gi1944136


Rattus norvegicus 


beta-alanine-pyruvate 
aminotransferase 


1625 


85 


2139 


gi29436673 


Mus musculus 


1700049K14Rik protein


648 


100 


2139 


gi4204421 


Euroglyphus maynei 


group 3 allergen Eur m 3 0101 
precursor 


212 


40 


2139 


gi5441861 


Paralichthys
olivaceus 


chymotrypsinogen 2 


210 


36 


2140 


gi17985046


Brucella melitensis 
16M 


GLYCOSYL TRANSFERASE 


130 


28 


2140 


gi20515259


Thermoanaerobacter 
tengcongensis 


predicted glycosyltransferases 


133 


32 


2140


gl440j /JO 


Streptomyces 
coelicolor A3(2) 


putative transferase 


140 


32 


2141 


gi13649477


Homo sapiens 


AF250309_1 putative cytokine
receptor CRL4 precursor


2694 


100 


2141 


gi30584223 


synthetic construct 


Homo sapiens interleukin 17B 
receptor 


2694 


100 


2141 


gi9246433 


Homo sapiens 


AF208111_1 IL-17 receptor
homolog precursor 


2688 


99 


2142 


gi18676472


Homo sapiens 


FLJ00133 protein


855 


76 


2142 


gi29568116 


Mus musculus 


secreted protein SST3 


725 


64


2142


gi499686 


Heliocidaris 
erythrogramma 


fibropellin Ia


390 


40 


2143 


gi16588687


Homo sapiens 


AF315687_1 S-adenosylhomocysteine
hydrolase-like protein


147 


100 


2143 


gi27692283 


Mus musculus 


S-adenosylhomocysteine 
hydrolase-like 1 


147 


100 


2143 


gi2852125 


Homo sapiens 


S-adenosyl homocysteine 
hydrolase homolog 


147 


100 


2144 


gi16740861


Homo sapiens 


AAH16292 ubiquitin- 
conjugating enzyme E2C 


521 


66 


2144 


gi29791813 


Homo sapiens 


Ubiquitin-conjugating enzyme 
E2C, isoform 1 


521 


66 


2144 


gi30583439 


Homo sapiens 


ubiquitin-conjugating enzyme 
E2C 


521


66 


2145


gi20086516


Homo sapiens 


AF245303_1 prominin-2
variant A 


Oil OA 

24 oU 




2145 


gi20086518 


Homo sapiens 


AF245304_1 prominin-2
variant B 


ia on 


yi 


2145


giZ4o3/joo 


Rattus norvegicus 


prominin-2 


15/0 


Oo 


2146 


gi29351676 


Homo sapiens 


Angiopoietin-like 5 




yy 


2146 


gi29468510 


Homo sapiens 


putative fibrinogen-like protein 




oo 
yy 


2146


gi9229906


Ciona intestinalis 


fibrinogen-like protein 




oy 


2147 


gi29351676 


Homo sapiens 


Angiopoietin-like 5 


1310 


99 


2147 


gi29468510 


Homo sapiens 


putative fibrinogen-like protein 


1305 


99 


2147 


gi9229906 


Ciona intestinalis 


fibrinogen-like protein 


392 


39 


2148 


gi29351676 


Homo sapiens 


Angiopoietin-like 5 


1310 


99 


2148 


gi29468510


Homo sapiens 


putative fibrinogen-like protein 


1305 


99 


2148 


gi9229906 


Ciona intestinalis 


fibrinogen-like protein 


392 


39 


2150 


gi13543706


Homo sapiens 


AAH06003 


349 


100 


2150 


gi20988061


Mus musculus 


1810013D10Rik protein


333 


92 


2150 


gi21619079


Homo sapiens 




349 


100 


2151 


gi11493652


Homo sapiens 


AF200708_1 calcium channel
blocker resistance protein 
CCBR1 


2168 


100 


2151


gi13924720


Homo sapiens 


AF252872_1 cystine/glutamate
transporter xCT 


2168 


100 


2151 


gi15082352


Homo sapiens 


AAH12087 member 11


2168 


100 


2152 


gi18043214


Mus musculus 


serine/arginine-rich protein 
specific kinase 2 


132 


67 


2152 


gi23270876 


Homo sapiens 


Similar to SFRS protein kinase 
2 


132 


67 


2152 


gi3406050 


Homo sapiens 


serine kinase SRPK2 


132 


67 


2153 


gi22164066 


Homo sapiens 


AF388385_1 neuroblastoma-amplified
protein


4284 


99 


2153 


gi30353863 


Homo sapiens 


NAG protein 


4298 


99 


2153 


gi4337460 


Homo sapiens 


neuroblastoma-amplified 
protein 


4272 


99 


2154 


gi22164066 


Homo sapiens 


AF388385_1 neuroblastoma-amplified
protein


4284 


99 


2154 


gi30353863 


Homo sapiens 


NAG protein 


4298 


99 


2154 


gi4337460 


Homo sapiens 


neuroblastoma-amplified 
protein 


4272 


99 


2155 


gi1008367


Saccharomyces 
cerevisiae 


CPS1


131 


48


2155


gi3594 


Saccharomyces
cerevisiae 


carboxypeptidase s 


131 


48 


2155 


gi3596 


Saccharomyces 
cerevisiae 


carboxypeptidase yscS 


131 


48 


2156 


gi11558029


Homo sapiens 


organic cation transporter i 


1876 


100 


2156 


gi18088251


Homo sapiens 


AAH20565 Similar to hBOIT 
for potent brain type organic 
ion transporter 


1838 


95 


2156 


gi9663117 


Homo sapiens 


organic cation transporter 


1868 


99 


2157 


gi21732438


Homo sapiens 


hypothetical protein 


567 


100


2157 


gi26330392 


Mus musculus 


unnamed protein product 


486 


85 


2157 


gi26390211 


Mus musculus 


unnamed protein product 


486 


85 


2158 


gi23893591 


Human herpesvirus 4 


BHLF1 early reading frame 


169 


28


2158 


gi30844300 


Cercopithecine 
herpesvirus 1 


immediate early protein ICP0 


166 


23 


2158 


gi30844317 


Cercopithecine 
herpesvirus 1 


immediate early protein ICP0 


166 


23 


2159 


gi27804346 


Homo sapiens 


BRD4-NUT fusion 
oncoprotein 


3773 


99 


2159 


gi3115204


Homo sapiens 


HUNK1


3787 


99 


2159 


gi3184498


Homo sapiens 


R31546_1


3837 


99 


2160 


gi15420832


Homo sapiens 


AF397394_1 NOE3-3


535 


96 


2160 


gi15420834


Homo sapiens 


AF397395_1 NOE3-4


535 


96 


2160 


gi18490927


Homo sapiens 


olfactomedin 3 


531 


95 


2161 


gi22209078 


Homo sapiens 


hypothetical protein 
DKFZp566D234 


773 


98 


2161 


gi6330966 


Homo sapiens 


KIAA1263 protein 


773 


98 


2161 


gi6808053 


Homo sapiens 


hypothetical protein 


766 


97 


2162 


gi12654031


Homo sapiens 


AAH00819 Similar to CG6950 
gene product 


158 


93 


2162 


gi21707106 


Homo sapiens 




120 


56 


2162 


gi758591


Homo sapiens 


glutamine-phenylpyruvate 
aminotransferase 


120 


56 


2163 


gi21666433


Mus musculus 


AF404775_1 actin-binding 
LIM protein 1 medium isoform 


302 


54 


2163 


gi2337952 


Homo sapiens 


actin-binding double-zinc- 
finger protein 


303 


54 


2163 


gi30259308 


Mus musculus 


actin-binding LIM protein 2 


498 


79 


2164 


gi2062399 


Rattus norvegicus 


protein serine/threonine kinase 
CPG16 


404 


50 


2164 


gi6716518 


Mus musculus 


AF1551 doublecortin-like 
kinase 


404 


50 


2164 


gi6716522


Mus musculus 


AF155821_1 CPG16


404 


50 


2165 


gi2062399 


Rattus norvegicus 


protein serine/threonine kinase 
CPG16 


404 


50 


2165 


gi6716518 


Mus musculus 


AF1551 doublecortin-like 
kinase 


404 


50 


2165 


gi6716522 


Mus musculus 


AF155821_1 CPG16


404 


50 


2166 


gi13436035


Mus musculus 


prostaglandin E synthase 2 


1321 


87 


2166 


gi29179467 


Danio rerio 


Similar to prostaglandin E 
synthase 2 


988 


66 


2166 


gi9280108 


Macaca fascicularis 


membrane-associated 
prostaglandin E synthase-2 


1449 


97 


2167 


gi12805247


Mus musculus 


Complement component 1, q
subcomponent, alpha
polypeptide


955


70


2167 


gi20988805


Homo sapiens 


complement component 1, q 
subcomponent, alpha 
polypeptide 


1318


100 


2167


gi4894854


Homo sapiens 


AF135157_1 complement C1q
A chain precursor


1318 


100


2168 


gi1491621


Bovine herpesvirus 1 


UL36 


126 


38 


2168 


gi15145795


Sus scrofa


basic proline-rich protein 


123 


38 


2168 


gi2653311 


Bovine herpesvirus 
type 1.1 (strain 
Cooper) 




126 


38 


2169 


gi21707458 


Homo sapiens 


PAX transcription activation 
domain interacting protein 1 
like 


2470 


81 


2169 


gi2565046 


Homo sapiens 


CAGF28 


3770 


97 


2169 


gi4336734 


Mus musculus 


Pax transcription activation 
domain interacting protein PTIP 


2945 


70 


2170


gi21707458


Homo sapiens 


PAX transcription activation 
domain interacting protein 1 
like


2470 


81 


2170 


gi2565046 


Homo sapiens 


CAGF28 


3770 


97


2170 


gi4336734 


Mus musculus 


Pax transcription activation 
domain interacting protein PTIP 


2945 


70 


2171


rrilO/1 CC71 Q 

glJZ4oo / lo 


Oryza sativa
(japonica cultivar- 
group) 


UijJNBaUUoonOy. iy 


121 


41


2172 


gi26353296 


Mus musculus 


unnamed protein product 


711 


76 


2172


gi28799187


Homo sapiens 


unnamed protein product 


1463


98


2172 


gi30908853 


Homo sapiens 


synleurin 


1463 


98 


2173 


gi13991167


Homo sapiens 


sialic acid-binding 
immunoglobulin-like lectin-like 
long splice variant 


1231 


99 


2173


gll4oz3ozZ 


Homo sapiens 


AF282256_1 Siglec-L1


1231 


99 


2173


mOIOOOO/CO. 

giziz/z/oy 


Homo sapiens 


SIGLEC-like 1


1231 


99 


2174


gilj43;>47o 


Mus musculus 


DNA segment, Chr 10, 
University of California at Los 
Angeles 1 


1206 


91 


2174


gizoz/yjjj 


Danio rerio


similar to DNA segment, Chr
10, University of California at 
Los Angeles 1 


865


69 


2174


oiOQ1 AAQSI 

gizy i^'tyo j 


Mus musculus 


uin a segment, cnr o, uka i u 
Doi 253, expressed 


OOO 


Of 


2175


gi27924102


Mus musculus


2310075M15Rik protein


944


68


2175 


gi29436830 


Mus musculus 


2310075M15Rik protein


944 


68 


2175


gi6273399


Homo sapiens


AF200348_1 melanoma-associated
antigen MG50


940 


67 


2176 


gi27924102 


Mus musculus 


2310075M15Rik protein


944 


68


2176 


gi29436830 


Mus musculus 


2310075M15Rik protein


944 


68 


2176 


gi6273399 


Homo sapiens 


AF200348_1 melanoma-associated
antigen MG50


940 


67 


2177 


gi27924102 


Mus musculus 


2310075M15Rik protein


944 


68 


2177 


gi29436830 


Mus musculus 


2310075M15Rik protein


944 


68 


2177 


gi6273399 


Homo sapiens 


AF200348_1 melanoma-associated
antigen MG50


940 


67 


2178 


gi11493483


Homo sapiens 


AF130117_48 PRO2550 


220 


56


2178


gi1872200


Homo sapiens 


alternatively spliced product 
using exon 13A


220 


51 


2178 


gi8572229 


Homo sapiens 


ubiquitous TPR-motif protein 
Y isoform 


217 


53 


2179 


gi6808611 


Homo sapiens 


AF204231_1 88-kDa Golgi
protein 


3209 


97 


2179 


gi6969980 


Homo sapiens 


AF163441_1 golgin67


2339 


98 


2179 


gi7211438 


Homo sapiens 


AF164622_1 golgin-67


2321 


97 


2180 


gi15030299


Mus musculus 


protein kinase, cAMP 
dependent regulatory, type I 
beta 


1881 


94 


2180 


gi200365 


Mus musculus 


cAMP-dependent protein 
kinase regulatory subunit 


1886 


94 


2180 


gi307377 


Homo sapiens 


cAMP-dependent protein 
kinase RI-beta regulatory
subunit 


1957 


99 


2181 


gi10945428


Homo sapiens 


membrane-associated 
guanylate kinase MAGI3 


156 


41 


2181 


gi12003994


Homo sapiens 


AF213259_1 membrane- 
associated guanylate kinase- 
related MAGI-3 


156 


41 


2181


gi7650497 


Rattus norvegicus 


AF255614_1 scaffolding
protein SLIPR 


156 


41 


2182 


gi1845577


Mus musculus 


-lipoxygenase 


2559 


74 


2182 


gi30047223 


Mus musculus 


Arachidonate lipoxygenase, 
epidermal 


2557 


74 




m*Q£/f CO 1 1 

giio4jy 16 




Mus musculus 


-lipoxygenase 


2559 


74 




gi1845577


Mus musculus 


-lipoxygenase 


2559 


74 


2183


gi30047223


Mus musculus


Arachidonate lipoxygenase, 
epidermal 


2557


74 




gijo4jy i j 


Mus musculus 


-lipoxygenase 


2559 


74


2184 


gi1845577


Mus musculus 


-lipoxygenase 


2559 


74 


2184


gi30047223


Mus musculus 


Arachidonate lipoxygenase, 
epidermal 


2557 


74 


2184


rti'l^/l^Oll 

gijo4oy 13 


Mus musculus 


-lipoxygenase 


2559 


74 




rri 1 (\A1QA9.< 


Homo sapiens 


unnamed protein product 


481


87


2185 


gi12853469


Mus musculus 


unnamed protein product 


395 


62 


2185


gi18027736


Homo sapiens 


AF318322_1 unknown


330 


50 


2186 


gi14198207


Mus musculus 


hypothetical protein BC008163 


1599 


98 


2186 


gi19343692


Homo sapiens 




1625 


100 


2186 


gi7294965 


Drosophila 
melanogaster 


CG4452-PA 


615 


40 


2192 


gi22209089 


Homo sapiens 


Similar to vesicular inhibitory 
amino acid transporter 


308 


98 


2192 


gi30354125 


Mus musculus 


Viaat protein 


308 


98 




dill ^^6^09 


Homo sapiens


Vesicular inhibitory amino acid 
transporter 




ys 


2193 


gi22507470 


Mus musculus 


AI413481 protein 


997 


92 


2193 


gi3097285 


Rattus norvegicus 


ZOG 


481 


48 


2193 


gi802014 


Rattus norvegicus 


preadipocyte factor 1 


481 


48 


2194 


gi1488314


Homo sapiens 


hepatitis delta antigen 
interacting protein A 


442 


49 


2194 


gi18088059


Mus musculus 


E030025D05Rik protein 


1622 


83 


2194 


gi6624073 


Homo sapiens 


AC007743_1 similar to
hepatitis delta antigen
interacting protein A


1903


94


2195 


gi14250638


Homo sapiens 


AAH08783 Similar to DNA 
segment, Chr 17, human 
D6S54E 


1886 


99 


2195 


gi3941733 


Mus musculus 


AAC82476 BAT4 


1453 


76 


2195 


gi4337106 


Homo sapiens 


AAD18082 BAT4 


1886 


99 


2196 


gi15277895


Homo sapiens 


AAH12939 Similar to 
cardiotrophin-like cytokine; 
neurotrophin-1/B-cell
stimulating factor-3 


1226 


100 


2196 


gi16356643


Homo sapiens 


cardiotrophin-like cytokine


1226 


100 


2196 


gi6007643 


Homo sapiens 


neurotrophin-1/B-cell
stimulating factor-3 


1226 


100 


2197 


gi15982236


Mus musculus 


putative methionyl 
aminopeptidase


1069 


92 


2197 


gi23306398 


Arabidopsis thaliana 


, putative 


739 


50 


2197 


gi24899771


Arabidopsis thaliana 


, putative 


739 


50 


2198 


gi13592175


Leishmania major 


AC084329_1 ppg3


196 


24 


2198 


gi28828184


Dictyostelium 
discoideum 


similar to Leishmania major. 
Ppg3 


180 


24 


2198 


gi5420387 


Leishmania major 


proteophosphoglycan 


202 


24 


2199 


gi19387136


Homo sapiens 


AF479748_1 PYRIN-containing
APAF1-like protein 5


4151 


91 


2199 


gi21410402


Mus musculus 


PYRIN-containing APAF1-like
protein 5 


1191 


54 


2199


gi28436366 


Homo sapiens 


NALP6 


4151 


91 


2200 


gi11321325


Homo sapiens 


AF311862_1 Lin-7b


684 


98 


2200 


gi20381193 


Homo sapiens 


Lin-7b protein; likely ortholog
of mouse LIN-7B; mammalian 
LIN-7 protein 2 


684 


98 


2200


giJooDo2o 


Rattus norvegicus 


lin-7-A


673 


96 


2201


gi14349125


Homo sapiens 


alpha2-glucosyltransferase 


567 


97 


2201 


gi32490259 


Oryza sativa 
(japonica cultivar- 
group) 


OSJNBb0116K07.1 


181 


46 


2201 


gUJ 13451 


Rattus norvegicus 


potassium channel regulator 1 


549 


96 


2202 


gi13325140


Homo sapiens 


AAH04383 


2693 


100 


2202 


gi35768 


Homo sapiens 


polypirimidine tract binding 
protein 


2693 


100 


2202 


gi35774 


Homo sapiens 




2693 


100 


2203 


gi21522776


Homo sapiens 


unnamed protein product 


2998 


98 


2203 


gi24047224 


Homo sapiens 


Similar to EGF-like-domain, 
multiple 6 


2982 


98 


2203 


gi6752658 


Homo sapiens 


AF186084_1 epidermal growth
factor repeat containing protein 


2984 


98 


2204 


gi21522776 


Homo sapiens 


unnamed protein product 


2998 


98 


2204 


gi24047224 


Homo sapiens 


Similar to EGF-like-domain, 
multiple 6 


2982 


98 


2204 


gi6752658 


Homo sapiens 


AF186084_1 epidermal growth 
factor repeat containing protein 


2984 


98 


2205 


gi11385648


Homo sapiens 


AF273045_1 CTCL tumor
antigen se14-3


3622 


95 


2205 


gi17980969


Homo sapiens 


AF454056_1 se14-3r protein


3858 


95 


2205 


gi29165763 


Mus musculus 


3632413B07Rik protein 


3261 


75


2206


gi11385648


Homo sapiens 


AF273045_1 CTCL tumor
antigen se14-3


3622 


95 


2206 


gi17980969


Homo sapiens 


AF454056_1 se14-3r protein


3858 


95 


2206 


gi29165763 


Mus musculus 


3632413B07Rik protein 


3261 


75 


2207 


gi11385648


Homo sapiens 


AF273045_1 CTCL tumor
antigen se14-3


3622 


95 


2207 


gi17980969


Homo sapiens 


AF454056_1 se14-3r protein


3858 


95 


2207 


gi29165763 


Mus musculus 


3632413B07Rik protein 


3261 


75 


2208 


gi11385648


Homo sapiens 


AF273045_1 CTCL tumor 
antigen se14-3


3622 


95 


2208 


gi17980969


Homo sapiens 


AF454056_1 se14-3r protein


3858 


95 


2208 


gi29165763


Mus musculus 


3632413B07Rik protein 


3261 


75 


2209 


gi14043211


Homo sapiens 


AAH07594 Similar to RIKEN 
cDNA 4931428F04 gene 


975 


97 


2209 


gi21750866 


Homo sapiens 


unnamed protein product 


975 


97 


2209 


gi25058997 


Mus musculus 


1110003N12Rik protein 


641 


62 


2210 


gi19387136


Homo sapiens 


AF479748_1 PYRIN-containing
APAF1-like protein 5


3078 


100 


2210 


gi202806 


Rattus norvegicus 


vasopressin receptor 


969 


67 


2210 


gi28436366 


Homo sapiens 


NALP6 


3078 


100 


2211 


gi13157560


Homo sapiens 




2246 


99 


2211 


gi18147612


Homo sapiens 


metalloprotease disintegrin 


2246 


99 


2211 


gi21908030


Homo sapiens 


a disintegrin and 
metalloprotease domain 33 


2230 


98 


2212 


gi13592175


Leishmania major 


AC084329_1 ppg3


163 


34 


2212 


gi15145803


Chlamydomonas 
reinhardtii 


hydroxyproline-rich 
glycoprotein VSP4 


150 


28 


2212 


gi5420387 


Leishmania major 


proteophosphoglycan 


157 


32 


2213 


gi15420879


Mus musculus 


AF39897M ankyrin repeat- 
containing SOCS box protein 
10 


1986 


83 


2213 


gi18031949


Mus musculus 


SOCS box protein ASB-18 


808 


44 


2213 


gi18092200


Homo sapiens 


AF417920_1 ASB-10


2062 


91 


2214 


gi32707 


Homo sapiens


interferon-omega 1 


331 


51 


2214 


gi386800 


Homo sapiens 


interferon-alpha 


334 


51 


2214


gi491284


synthetic construct 


IFN-pseudo-omega 2 


806 


99 


2215 


gi6841550 


Homo sapiens 


AF161513_1 HSPC164


1594 


99 


2215 


gi6841560 


Homo sapiens 


AF161518_1 HSPC169


1604 


100 


2215 


gi9844577 


Homo sapiens 




1601 


99 


2216 


gi11493483


Homo sapiens 


AF130117_48 PRO2550


408 


79 


2216 


gi1872200


Homo sapiens 


alternatively spliced product 
using exon 13A 


352 


74 


2216 


gi7020440 


Homo sapiens 


unnamed protein product 


396 


76 




swIOI/ZCO A 1 0 

gl220Do41o 


Mus musculus 


cDNA sequence BC030934


365 


71 


2217 


gi28838433 


Homo sapiens 


DKFZp762A2013 protein 


443 


87 


2217 


gi30842594 


Homo sapiens 


putative sulfhydryl oxidase 
precursor 


360 


74 


2218 


gi12958660


Homo sapiens 


AF321918_1 acid phosphatase


573 


89 


2218 


gi12958663


Homo sapiens 


AF321918_4 acid phosphatase 
variant 3 


573 


89 


2218 


gi202934 


Rattus norvegicus 




207 


43 


2219 


gi15866260


Homo sapiens 


AF411132_1 MRIP2


2479 


97 


2219 


gi29476839 


Homo sapiens 


Similar to centaurin, gamma 2 


2124 


98


2219


gi30354556


Homo sapiens 


MRIP2 protein


ZhOO 


97


2220 


gi15866260


Homo sapiens 


AF411132_1 MRIP2


2479


97


2220


gi29476839


Homo sapiens


Similar to centaurin, gamma 2


2124


98


2220 


gi30354556


Homo sapiens 


MRIP2 protein




97


2221 


gi15866260


Homo sapiens 


AF411132 1 MRIP2 


2479 


97 


2221 


gi29476839 


Homo sapiens 


Similar to centaurin, gamma 2


2124 


98


2221 


gi30354556 


Homo sapiens 


MRIP2 protein


2400 


97


2222 


gi15866260


Homo sapiens 


AF411132_1 MRIP2


2479


97


2222 


gi29476839 


Homo sapiens 


Similar to centaurin, gamma 2 


2124 


98


2222


gi30354556 


Homo sapiens 


MRIP2 protein


2466 


97


2223 


gi1841702


Macaca fascicularis 


fertilin alpha-I isoform 


655 


01 

oi 


2223 


gi2632092 


Pongo pygmaeus 


fertilin alpha protein 


745


94


2223 


gi2655944 


Papio anubis 


fertilin alpha-I 


661


Of 

o5 


2224 


gi17887359


Oryctolagus 
cuniculus 


lipophilin AL2 


248 


54 


2224 


gi4107229


Homo sapiens 


lipophilin A 


454 


100


2224 


gi4107231


Homo sapiens 


lipophilin B 


267


60


2225 


gi180251


Homo sapiens 


precerebellin 


183 


48 


2225 


gi6942096 


Mus musculus 


CBLN3 


472 


90


2225 


gi6942098 


Mus musculus 


AF218380 1 CBLN3 


472 


90 


2226 


gi18255724


Mus musculus 


LOC215928 protein


131 


28 


2226 


gi21750370 


Homo sapiens 


unnamed protein product 


917 


85 


2226 


gi28460663 


Rattus norvegicus 


Na+ dependent glucose 
transporter 1 


185 


30 


2227 


gi18255724


Mus musculus 


LOC215928 protein


131 


28 


2227 


gi21750370


Homo sapiens 


unnamed protein product 


917 


85 


2227 


gi28460663 


Rattus norvegicus 


Na+ dependent glucose 
transporter 1 


185 


30 


2228 


gi5726236 


multiple sclerosis 
associated retrovirus 
element 


gag polyprotein 


173 


53 


2228 


gi5726238 


multiple sclerosis 
associated retrovirus 
element 


AF123881_1 gag polyprotein 


163


57 


2228 


gi8272464


Homo sapiens 


AM 5696 11 gag 


1 01 
iy 1 




2229 


gi12964746


Mus musculus 


AF316612_1 neuronal
pentraxin receptor 


2225 


88 


2229 


gi2253263 


Rattus norvegicus 


neuronal pentraxin receptor 


OKA 

225U 


OO 


2229 


gi4160197


Homo sapiens 




25 5 y 


OO 

yy 


2230 


gi3170615 


Mus musculus 


DOC4 


1 con 
152U 


yj 


2230 


gi4760782 


Mus musculus 


Ten-m4 


i con 
152U 


oc 
yj 


2230 


gi9909617 


Gallus gallus 


teneurin-4 


1 i n 
1333 


OA 


2232 


gi14124993


Homo sapiens 




232 


Ol 


2232 


gi30704639 


Mus musculus 


4930553F24Rik protein 


210 


74 


2232


gi//l61UU 


Rattus norvegicus 


Ar22oyy5_i. selective lim 
binding factor 


91 ^ 


16 
/o 


2233 


gi20987535 


Mus musculus 


Mcoln2 protein 


804 


92 


2233 


gi24417793 


Mus musculus 


mucolipin 2 


804 


92 


2233 


gi24417795


Homo sapiens 


mucolipin 2 


857 


99 


2234 


gi20987535 


Mus musculus 


Mcoln2 protein 


804 


92 


2234 


gi24417793 


Mus musculus 


mucolipin 2 


804 


92 


2234 


gi24417795 


Homo sapiens 


mucolipin 2 


857 


99 


2235 


gi22477432 


Homo sapiens 


DKFZP762N2316 protein


1002 


100 


2235 


gi27370669 


Homo sapiens 


Similar to RE1-silencing
transcription factor


159


36


2235


gl4UJUZU 


Mus musculus


En-2/lacZ fusion protein


330 


92 


ZZio 


gii lyyoizo 


Camelus 
dromedarius 


chymosin 


294 


83 


ZZ.30 


gi4y lyoz 


synthetic construct 


preprochymosin


291


83 


771Q 


gl/UUol/ZJ 


Callithrix jacchus 


prochymosin 


314 


91 




giz/ jjoyo4 


Homo sapiens 


extracellular sulfatase SULF-2


560 


100 




giz/ jooyjo 


Mus musculus 


extracellular sulfatase SULF-2


499 


90


2239


gizy I0jo4j 


Mus musculus 


extracellular sulfatase SULF-1


375 


70 


2240 


gi27124671 


Homo sapiens 


Zn-carboxypeptidase 


877 


96 


2240 


gi2960072 


Homo sapiens 


procarboxypeptidase B 


488 


55 


2240 


gi32880163 


Homo sapiens 




487 


55 


2241 


gi27124671


Homo sapiens 


Zn-carboxypeptidase 


877 


96 


2241 


gi2960072 


Homo sapiens 


procarboxypeptidase B 


488 


55 


2241 


gi32880163 


Homo sapiens 




487 


55 


2242 


gi11545705


Homo sapiens 


ISCU1 


663 


99 


2242 


gi11545707


Homo sapiens 


ISCU2 


845 


100 


2242 


gi20381021 


Mus musculus 


Nifu-pending protein 


807 


96


2243 


gi17512406


Mus musculus 


differential display and 
activated by p53 


188 


52 


2243 


gi25166615


Homo sapiens


AF223000_1 DDA3-like
protein


427 


56 


2243 


gi25166621


Homo sapiens 


AF322891_1 DDA3-like
protein 


427 


56 


2244 


gi15990480


Homo sapiens 


-binding protein 2 


1200 


99 


2244


glZlyolZl/ 


Homo sapiens 


-binding protein 2 


1200 


99 


2244


gi22213050


Mus musculus 


B230313N05Rik protein


1189 


97 


2245


gi204058 


Rattus norvegicus 


extracellular signal-related 
kinase 3 


1497 


62 


2245


^vj*^ aa^ 

gizjyui 


Homo sapiens 


63kDa protein kinase 


2886 


98 


2245


giZ/oozlZJ 


Danio rerio


Similar to mitogen-activated 
protein kinase 4 


1670 


61 


2246




Homo sapiens 


nesprin-2 


354 


100 


2246


gizoiyjo/y 


Homo sapiens 


nesprin-2 alpha 2 


354 


100 


2246


gizo lyjoo l 


Homo sapiens 


nesprin-2 beta 2 


354 


100






Mus musculus 


C1q-like


560 


80 


2248 


gi26996600 


Mus musculus 


Similar to C1q-like


692 


96 


2248


gl3Z4Ul ZZ/ 


Homo sapiens 


AF525315_1 C1q-domain
containing protein 


711 


99 


2249 


gi14718648


Homo sapiens 


allantoicase 


967 


99 


2249


gl209o/Ooy 


Homo sapiens 


Similar to allantoicase 


1162 


99 


2249 


gi9255889 



Mus musculus 


AF278712_1 allantoicase


932 


78 


2250 


gi15617341


Homo sapiens 


LAG-3 protein precursor 


2796 


99 


2250 



gi3085H87 


Homo sapiens 


LAG3 protein 


1906 


99 


2250 




Homo sapiens


lymphocyte protein


Z0.54 


no 

ye 


2251 


gi13810285


Rattus norvegicus 


guanine nucleotide 
release/exchange factor 


5807 


91 


2251 


gi2522208 


Homo sapiens 


Ras-GRF2 


6407 


99 


2251 


gi5882290 


Homo sapiens 


Ras guanine nucleotide 
exchange factor 2 


6401 


99 


2252 


gi22038159 


Homo sapiens 


AF527605_1 zizimin1


7984 


100 


2252 


gi28374168 


Mus musculus 


AA959601 protein 


7520 


93 


2252 


gi31419757 


Mus musculus 


AA959601 protein 


7520 


93 


2253 


gi10433672


Homo sapiens 


unnamed protein product 


1325 


89 
2253


gi19263505


Homo sapiens 


hypothetical protein FLJ12242


1325 


89 


2253 


gi23272394 


Homo sapiens 


KCTD2 protein 


728 


67 


2254 


gi14041697


Homo sapiens 




3330 


94 


2254 


gi21594273


Homo sapiens 




3371 


95 


2254 


gi25303955 


Homo sapiens 




3371 


95 


2255 


gi1438532


Rattus norvegicus 


rA1


393 


51


2255 


gi1438534


Rattus norvegicus 


rA9 


857 


70


2255 


gi9438033 


Homo sapiens 


AF254411_1 ser/arg-rich pre-
mRNA splicing factor SR-A1


386 


51 


2256 


gi1438532


Rattus norvegicus 


rA1


393 


51 


2256 


gi1438534


Rattus norvegicus 


rA9 


857 


70 


2256 


gi9438033 


Homo sapiens 


AF254411_1 ser/arg-rich pre-
mRNA splicing factor SR-A1


386 


51 


2257 


gi1872200


Homo sapiens 


alternatively spliced product 
using exon 13A 


242 


58 


2257 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c- 
NTP 


283 


59 


2257 


gi32486167 


Homo sapiens 


AD7C-NTP 


283 


59 


2258 


gi12652851


Homo sapiens 


AAH00178 potassium channel 
modulatory factor 


1987 


100 


2258


gi26453336 


Homo sapiens 


FIGC1


1983 


99 


2258 


gi7677058 


Homo sapiens 


AF155652_1 potassium
channel modulatory factor 


1983 


99 


2259 


gi27695389 


Mus musculus 


MGC58017 protein 


1050 


97 


2259 


gi28558964 


Human herpesvirus 4 
type 2 


nuclear antigen-3B 


138 


28 


2259 


gi30481648


Homo sapiens 




660 


55 


2260


gi11119239


Rattus norvegicus


AF313453_1 synaptotagmin 13


792


86


2260


gi14210274


Rattus norvegicus


AF375466_1 synaptotagmin 13


792 


86 


2260 


gi21410154 


Mus musculus 


synaptotagmin 13 


779 


84 


2261 


gi11342591


Mus musculus 


RanBP7/importin 7 


5301 


97


2261 


gi32330683 


Mus musculus 


importin 7 


5313 


97 


2261 


gi3800881 


Homo sapiens 


RanBP7/importin 7 


5333 


98 


2262 


gi17939650


Homo sapiens 


AAH19302 hypothetical 
protein FLJ12525 


3660 


97 


2262 


gi18676522


Homo sapiens 


FLJ00158 protein


1599 


100 


2262 


gi27462078 


Homo sapiens 


AF116730_1 MSTP060


3629 


94 


2263 


gi28981429 


Mus musculus 


Ddef1 protein


879 


94 


2263 


gi4063614 


Mus musculus 


ADP-ribosylation factor- 
directed GTPase activating 
protein isoform a 


879 


94 


2263 


gi4406393 


Bos taurus 


differentiation enhancing factor 
1 


876 


94 


2264 


gi59500 


Human herpesvirus 1 


RL2 


139 


37 


2264


gijyj j / 


Human herpesvirus 1


immediate early protein 


139 


37 


2264 


gi59833 


Human herpesvirus 1 


IE110 


139 


37 


2265 


gi13872813


Homo sapiens 


fibulin-6


513 


29 


2265 


gi14575679


Homo sapiens 


AF156100_1 hemicentin


513 


29 


2265 


gi9280405 


Homo sapiens 


AF245505_1 adlican


1462 


46 


2266 


gi15145797


Sus scrofa 


basic proline-rich protein 


178 


25 


2266 


gi27348769 


Bradyrhizobium 
japonicum USDA 
110 


blr0521 


191 


29 


2266 


gi30844278 


Cercopithecine
herpesvirus 1


very large tegument protein


178


25








2267 


gi21748983


Homo sapiens


unnamed protein product


128


OJ


2267


gi522145


Homo sapiens


B-cell growth factor


129


71


2268


gi21748983


Homo sapiens


unnamed protein product


128


Oj


2268




Homo sapiens


B-cell growth factor


129


71


2269 


gi13529248


Homo sapiens 


Centrin 3 


842 


100 


2269




Homo sapiens




842


100




UJ On OU 1


synthetic construct




842


100


2270


5IJ InJJZJD


Homo sapiens


liYLfVvjiij j L\jj i f protein


2259


91


2270


gi32492907


Homo sapiens


selenoprotein O


2259


91


2270


01^^799^0


Homo sapiens




1768


98


2271


ot"*1zKS9<?6
glJ l*f JJZjO


Homo sapiens


LwUxsjEiD j iuj i / protein


2259


91


2271 


gi32492907 


Homo sapiens 


selenoprotein O


2259 


91 


2271


glOD /ZZJU 


Homo sapiens 




1768 


98


2272 


gi21928729 


Homo sapiens 


seven transmembrane helix 
receptor 


661 


99 


2272 


gi6693701 


Homo sapiens 


AF147788_1 melanopsin 


661 


99 


2272


giooyj /Ui 


Mus musculus 


AF147789_1 melanopsin


529 


83 


2273 


gi20072741 


Mus musculus 


E430025L02Rik protein 


538 


81 


2273


giZlU4<Oo 


Rattus norvegicus 


platelet glycoprotein V 


143 


41 


2273 


gi439296 


Homo sapiens 


garp 


166 


43 


2274 


gi15487302


Homo sapiens 


medium-chain acyl-CoA 
synthetase 


727 


97 


2274 


gi15706421


Homo sapiens 


middle-chain acyl-CoA 
synthetase 1 


727 


97 


2274 


gi5019275


Bos taurus 


xenobiotic/medium-chain fatty 
acid:CoA ligase form XL-III


529 


70 


2275 


gi15077826


Homo sapiens 


AF394782_1 rap guanine 
nucleotide exchange factor 


2149 


100 


2275 


gi20386206 


Homo sapiens 


AF478567_1 PDZ domain-
containing guanine nucleotide 
exchange factor PDZ-GEF2 


2149 


100 


2275 


gi6650766 


Homo sapiens 


AF117947_1 PDZ domain-
containing guanine nucleotide 
exchange factor 1 


2149 


100 


2276 


gi15077826


Homo sapiens


AF394782_1 rap guanine
nucleotide exchange factor


2149


100


2276


gi20386206


Homo sapiens


AF478567_1 PDZ domain-
containing guanine nucleotide
exchange factor PDZ-GEF2


2149


100


2276


gi6650766


Homo sapiens


AF117947_1 PDZ domain-
containing guanine nucleotide
exchange factor 1


2149


100


2277 


gi13592175


Leishmania major 


AC084329_1 ppg3


165 


29 


2277 


gi5420387 


Leishmania major 


proteophosphoglycan 


163 


26 


2277 


gi5420389 


Leishmania major 


proteophosphoglycan 


151 


30 


2278 


gi18676788


Homo sapiens 


unnamed protein product 


875 


88 


2278 


gi21779866 


Mus musculus 


AF458068_1 IL-17RE


234 


38 


2278 


gi21779869 


Homo sapiens 


AF458069_1 IL-17RE


875 


88 


2279 


gi18676788


Homo sapiens 


unnamed protein product 


875 


88 


2279 


gi21779866 


Mus musculus 


AF458068_1 IL-17RE


234 


38 


2279 


gi21779869 


Homo sapiens 


AF458069_1 IL-17RE


875 


88 


2280 


gi14150450


Rattus norvegicus 


AF241241_1 UDP-
GalNAc:polypeptide N-
acetylgalactosaminyltransferase
T9


197


85






2280 


gi25809274 


Homo sapiens 


polypeptide N- 

acetylgalactosaminyltransferase 
10 


219 


97 


2280 


gi28268676 


Homo sapiens 


UDP-N-acetyl-alpha-D- 
galactosamine:polypeptide N- 
acetylgalactosaminyltransferase 
10 


219 


97 


2281 


gi17384577


Escherichia coli 


orfll76 


1087 


99 


2281 


gi28629348 


Escherichia coli 


SopA 


1087 


99 


2281 


gi42431 


Escherichia coli 




1087 


99 


2282 


gi1377895


Homo sapiens 


OB-cadherin-2 


540 


51 


2282 


gi30171995 


Homo sapiens 


cadherin-24 


990 


100 


2282 


gi30171998 


Homo sapiens 


cadherin-24 variant 


990 


100 


2283 


gi1377895


Homo sapiens 


OB-cadherin-2 


540 


51 


2283 


gi30171995 


Homo sapiens 


cadherin-24 


990 


100 


2283 


gi30171998 


Homo sapiens 


cadherin-24 variant 


990 


100 


2284 


gi1398903


Mus musculus 


Ca2+ dependent activator 
protein for secretion 


1303 


89 


2284 


gi21541504 


Homo sapiens 


AF458662_1 calcium-
dependent activator protein for 
secretion protein 


1185 


83 


2284 


gi577428 


Rattus norvegicus 


Ca2+-dependent activator 
protein; calcium-dependent 
actin-binding protein 


1247 


85 


2285 


gi11071729


Homo sapiens 


putative dipeptidase 


526 


100 


2285 


gi11125344


Homo sapiens 


putative metallopeptidase 


263 


58 


2285 


gi32490515 


Mus musculus 


putative membrane-bound 
dipeptidase-3 


245 


55 


2286 


gi11493652


Homo sapiens 


AF200708_1 calcium channel 
blocker resistance protein 
CCBR1 


2168 


100 


2286 


gi13924720


Homo sapiens 


AF252872_1 cystine/glutamate 
transporter xCT 


2168 


100 


2286 


gi15082352


Homo sapiens 


AAH12087 member 11


2168 


100 


2287 


gi17028348


Homo sapiens 


DKFZP586G1517 protein 


3748 


100 


2287 


gi20987924 


Mus musculus 


2410004L15Rik protein


3473 


92 


2287 


gi29612455


Mus musculus 


2410004L15Rik protein 


3819 


92 


2288 


gi19352987


Homo sapiens 


Similar to KIAA0433 protein 


6283 


97 


2288 


gi2887437 


Homo sapiens 


KIAA0433


6416 


98 


2288 


gi31418648 


Mus musculus 




4916 


95 


2289 


gi24061707 


Mus musculus 


GAP-related interacting partner 
to E12


766 


88 


2289 


gi26334941 


Mus musculus 


unnamed protein product 



783 



89 


2289 


gi4240257 


Homo sapiens 


KIAA0884 protein 


725 


75 


2290 


gi20269957 


Sus scrofa 


AF498759_1 phospholipase C 
delta 4 


166 


96 


2290 


gi21307610 


Mus musculus 


phospholipase C delta 4 


158 


90 


2290 


gi571466 


Rattus norvegicus 


phospholipase C delta-4 


151 


84 


2291 


gi12839717


Mus musculus 


unnamed protein product 


238 


62 


2291 


gi16552885


Homo sapiens 


unnamed protein product 


382 


92 


2291 


gi26327387 


Mus musculus 


unnamed protein product 


238 


62 


2292 


gi18480186


Mus musculus 


olfactory receptor MOR261-6 


1330 


81 
2292


gi32052343 


Mus musculus 


olfactory receptor 
GA_x6K02T2P3E9-4384160- 
4383228 


1330 


81 


2292 


gi9368991 


Homo sapiens 




1397 


99 


2293 


gi29791964 


Homo sapiens 


Thrombospondin 4 


2097 


100 


2293 


gi311626 


Homo sapiens 


thrombospondin-4 


2090 


99 


2293 


gi4895079 


Mus musculus 


thrombospondin 4 


2047 


96 


2294 


gi24460119 


Mus musculus 


AF327451_1 JNK-associated
leucine-zipper protein 


6108 


95 


2294 


gi24460121 


Homo sapiens 


AF327452_1 JNK-associated 
leucine-zipper protein 


6282 


98 


2294 


gi3116015


Homo sapiens 


sperm specific protein 


3848 


100 


2295 


gi21654741 


Homo sapiens 


peptide/histidine transporter 


2861 


100 


2295 


gi2208839 


Rattus norvegicus 


peptide/histidine transporter 


2484 


87 


2295 


gi33126130 


Homo sapiens 


peptide/histidine transporter 


2826 


99 


2296 


gi19353264


Homo sapiens 


Similar to dishevelled 
associated activator of 
morphogenesis 2 


193 


34 


2296 


gi2224703 


Homo sapiens 


KIAA0381 


291 


50 


2296 


gi30268369 


Homo sapiens 


hypothetical protein 


291 


50 


2297 


gi22760046 


Homo sapiens 


unnamed protein product 


918 


95 


2297 


gi27769120 


Homo sapiens 


Similar to hypothetical protein 
FLJ30921 


918 


95 


2297 


gi33417243 


Mus musculus 


B230312118Rik protein 


621 


62 


2298 


gi12655913


Homo sapiens 


AF227516_1 sprouty-4A


494 


97 


2298 


gi12655915


Homo sapiens 


AF227517_1 sprouty-4C 


413 


100 


2298 


gi29747900 


Mus musculus 


Sprouty homolog 4 


347 


83 


2299 


gi29692498 


Mus musculus 


NAAG-peptidase II 


3438 


87 


2299 


gi3211746


Sus scrofa 


folylpoly-gamma-glutamate 
carboxypeptidase 


2813 


70 


2299 


gi4539525 


Homo sapiens 


NAALADase II protein 


3872 


99 


2300 


gi21750009 


Homo sapiens 


unnamed protein product 


501 


100 


2300 


gi23092685 


Drosophila 
melanogaster 


CG7020-PA 


150 


76 


2300 


gi23512248


Homo sapiens 


Similar to DISCO Interacting 
Protein 2 


238 


56 


2301 


gi21410507 


Mus musculus 


Plxnb2 protein 


465 


75 


2301 


gi6010211 


Homo sapiens 


semaphorin receptor 


225 


47 


2301 


gi9885259 


Homo sapiens 


AF149019_1 plexin-B3 


228 


47 


2302 


gi11692802


Homo sapiens 


AF320294_1 ABCG8


287 


88 


2302 


gi15088540


Homo sapiens 


AF324494_1 sterolin-2


287 


88 


2302 


gi15146444


Homo sapiens 


AF351824_1 sterolin-2


287 


88 


2303 


gi12652851


Homo sapiens 


AAH00178 potassium channel 
modulatory factor 


1987 


100 


2303 


gi26453336 


Homo sapiens 


FIGC1 


1983 


99 


2303


gi7677058 


Homo sapiens 


AF155652_1 potassium
channel modulatory factor 


1983 


99 


2305 


gi24430369 


Mus musculus 


MMAC8 


280 


47 


2305 


gi31338848 


Mus musculus 


MAIR-Ia 


285 


46 


2305 


gi31338850 


Mus musculus 


MAIR-Ib 


280 


47 


2306 


gi31414326


Homo sapiens 


MHC class I antigen 


1941 


99 


2306 


gi33187148 


Homo sapiens 


HLA-A2 


1941 


99 


2306 


gi403144 


Homo sapiens 


MHC class I lymphocyte 
antigen 


1941 


99 
2307


gi21667214 


Homo sapiens 


AF465767_1
bactericidal/permeability- 
increasing protein-like 3 


743 


90 


2307 


gi32490539 


Homo sapiens 


RY2G5 


191 


29 


2307 


gi57732 


Rattus rattus 


potential ligand-binding 
protein 


231 


32 


2308 


gi21667214 


Homo sapiens 


AF465767_1

bactericidal/permeability- 
increasing protein-like 3 


743 


90 


2308 


gi32490539 


Homo sapiens 


RY2G5 


191 


29 


2308 


gi57732 


Rattus rattus 


potential ligand-binding 
protein 


231 


32 


2309 


gi21667214 


Homo sapiens 


AF465767_1 
bactericidal/permeability- 
increasing protein-like 3 


743 


90 


2309 


gi32490539 


Homo sapiens 


RY2G5 


191 


29 


2309 


gi57732 


Rattus rattus 


potential ligand-binding 
protein 


231 


32 


2310 


gi21667214 


Homo sapiens 


AF465767_1 

bactericidal/permeability- 
increasing protein-like 3 


743 


90 


2310 


gi32490539 


Homo sapiens 


RY2G5 


191 


29 


2310 


gi57732 


Rattus rattus 


potential ligand-binding 
protein 


231 


32 


2311 


gi13529158


Homo sapiens 


AAH05349 


1137 


99 


2311 


gi529514 


Sus scrofa 


neuronal endocrine protein 


1073 


94 


2311 


gi7718079


Homo sapiens 


neuroendocrine protein 7B2 


1129 


99 


2312 


gi15029903


Mus musculus 


Similar to proline-rich protein 
BstNI subfamily 2 


175 


31 


2312 


gi31746553


Caenorhabditis 
elegans 


Collagen protein 51


171 


35 


2312 


gi32698037 


Caenorhabditis 
elegans 




174 


33 


2313 


gi13543081


Mus musculus 


claudin 6 


822 


70 


2313 


gi4128041


Homo sapiens 


claudin-9 protein 


1116


100 


2313 


gi432529o 


Mus musculus 


claudin-9 


1078 


95 


2314 


gi18676638


Homo sapiens


FLJ00218 protein


574 


95 


2314 


gi4587895 


Rattus norvegicus 


AF072509_1 glutamate 
receptor interacting protein 2 


667 


84 



2314 


gi6601555 


Rattus norvegicus 


glutamate receptor interacting 
protein 2 


667 


84 


2315 


gi23496442 


Rattus norvegicus 


disabled-1


2807 


96 


2315 



gi3288852 


Homo sapiens 


disabled-1


2865 


99 


2315 


gi8118615


Homo sapiens 


AF263547_1 disabled-1


2842 


99 


2316


glloo / /*oo 


Homo sapiens 


a Aui zqia 






2316 


gi20810324


Homo sapiens 




493 


100 


2316 


gi26351033 


Mus musculus 


unnamed protein product 


444 


91 


2317 


gi15430703


Homo sapiens 


AF362953_1 testis specific 
serine/threonine kinase 2 


1854 


99 


2317 


gi2738898 


Mus musculus 


protein kinase 


1684 


89 


2317 


gi33590489 


Rattus norvegicus 


serine/threonine kinase 22B 


1755 


92 


2318 


gi12963879


Homo sapiens 


prostaglandin D synthase 


998 


100 


2318 


gi13543568


Homo sapiens 


PTGDS protein 


998 


100 


2318 


gi189772


Homo sapiens 


prostaglandin D2 synthase 


998 


100 
2319 


gi14336718




Homo sapiens 


Ai}UUo4o4_lo similar to 

xlALrrl 



656 


99 




gi14336766


Homo sapiens 


hydroxyacylglutathione
hydrolase




An 
4/ 




glZUVooooj 


Mus musculus 


9R1 001 AT91T?i1r r»rnt#»in 

zo luu iHizjxviK protein 


Da j 


no. 
to 


2320


rril 110*7521^ 

gllJoy /OOJ 


Homo sapiens 


diuivAin nij lsoiorm d 


lZ*tJ 


Ofi 

yo 


2320




Homo sapiens


intestine-specific annexin


1ZDZ 


QR 
yo 


2320


gl/3 / /o4 


Canis familiaris


allllCAlIl AJ.110 


1 ID I 


01 


2321


fri9fi/1990 
glZU*fZZZ 


Rattus norvegicus


GABA transporter protein


11 OA 
ZlZ^t 


QO 


2321 


gi21707908 


Homo sapiens 


, member 1 


2132 


99 


2321


glJlOJO 


Homo sapiens 


GABA transporter


01 1 *7 

ZH / 




2323 


gi20381266 


Homo sapiens 


Glypican 2 


602 


90 


2323 


gi440127


Rattus norvegicus 


cerebroglycan 


54o 


Ol 
01 


2323 


gi59H318 


Homo sapiens 


AF105267_1 glypican-6 


265 


47 


2324 



gi18676470


Homo sapiens 


FLJ00132 protein


1361 



100 


2324 


gi19344068


Mus musculus 


2700038E08Rik protein 


2403 


74 


2324 


gi23274106 


Mus musculus 


2700038E08Rik protein


2403 


74 


2325 


gi25396387 


Homo sapiens 


alpha 2,6-sialyltransferase 


467 


98 


2325 


gi27650880 


Homo sapiens 


beta-galactoside alpha-2,6- 
sialyltransferase 


467 


98 


2325 


gi452751 


Gallus gallus 


Gal beta 1,4 GlcNAc alpha 2,6- 
sialyltransferase 


268 


58 


2326 


gi13344995


Homo sapiens 


Cat Eye Syndrome critical 
region protein isoform 1 


2004 


99 


2326 


gi13344997


Homo sapiens 


Cat Eye Syndrome critical 
region protein isoform 2 


2001 


100 


2326 


gi27503696 


Homo sapiens 


Similar to cat eye syndrome 
chromosome region, candidate 
5 


2001 


100 


2327 


gi13344995


Homo sapiens


Cat Eye Syndrome critical
region protein isoform 1


2004


99


2327


gi13344997


Homo sapiens


Cat Eye Syndrome critical
region protein isoform 2


2001


100


2327


gi27503696


Homo sapiens


Similar to cat eye syndrome
chromosome region, candidate
5


2001


100


2328


gi202592


Rattus norvegicus


prealpha-2-macroglobulin




40


2328


gi671864




ovomacroglobulin, ovostatin


230


40


2328


gi671865


Gallus gallus


ovomacroglobulin, ovostatin


230


40


2329


gi202592


Rattus norvegicus


prealpha-2-macroglobulin


238


40


2329


gi671864


Gallus gallus


ovomacroglobulin, ovostatin


230


40


2329


gi671865


Gallus gallus


ovomacroglobulin, ovostatin


230


40


2330


gi202592


Rattus norvegicus


prealpha-2-macroglobulin


238


40




gi671864


Gallus gallus


ovomacroglobulin, ovostatin




40


2330 


gi671865 


Gallus gallus 


ovomacroglobulin, ovostatin 


230 


40 


2331 


gi202592 


Rattus norvegicus 


prealpha-2-macroglobulin 


238 


40 


2331 


gi671864 


Gallus gallus 


ovomacroglobulin, ovostatin 


230 


40 


2331 


gi671865 


Gallus gallus 


ovomacroglobulin, ovostatin 


230 


40 


2332 


gi202592 


Rattus norvegicus 


prealpha-2-macroglobulin 


238 


40 


2332 


gi671864 


Gallus gallus 


ovomacroglobulin, ovostatin 


230 


40 


2332 


gi671865 


Gallus gallus 


ovomacroglobulin, ovostatin 


230 


40 


2333 


gi14789873


Mus musculus 


Es31 protein 


508 


70 


2333 


gi17512361


Mus musculus 


esterase 31


508 


70 
2333


gi29476863 


Mus musculus 


Similar to esterase 31 


516 


69 


2334 


gi19909128


Homo sapiens 


AF489528_1 transforming 
growth factor-beta binding 
protein-1S


189 


100 


2334


giZU/Zoo 



Rattus norvegicus 



TGF-beta masking protein
large subunit 


179 


90 


2334


gl33y34o 


Homo sapiens 


transforming growth factor- 
beta 1 binding protein precursor 


189 


100 


Z330 


gll Jool JO 


Gallus gallus


myomesin 


429 


37 


Z330 



gl3l4loZlZ 


Homo sapiens 


Myomesin 2 


439 


36 


Z330 


gi4u /uy / 


Homo sapiens 


lojicu protein 


439 


36 




gi12655442


Homo sapiens 


keratin associated protein 4.2 


706 


86 


2339 


gi12655460


Homo sapiens 


keratin associated protein 4.12 


732 


86 


2339



gi12655464


Homo sapiens 


keratin associated protein 4.15 


761 


99 


2340


gi12655442


Homo sapiens 


keratin associated protein 4.2 


706 


86 


2340


gi12655460


Homo sapiens 


keratin associated protein 4.12 


732 


86 


2340 


gi12655464


Homo sapiens 


keratin associated protein 4.15


761 


99 


2341 


gi12655442


Homo sapiens 


keratin associated protein 4.2 


706 


86 


2341 


gi12655460


Homo sapiens 


keratin associated protein 4.12 


732 


86 


2341 


gi12655464


Homo sapiens 


keratin associated protein 4.15 


761 


99


2342 


gi15722084


Homo sapiens 




1930 


99 


2342 


gi434306 


Homo sapiens 


lysosomal acid lipase; sterol 
esterase 


1288 


63 


2342 


gi506431


Homo sapiens 


lysosomal acid lipase 


1288 


63 


2343 


gi15722084


Homo sapiens 




1930 


99 


2343 


gi434306 


Homo sapiens 


lysosomal acid lipase; sterol 
esterase 


1288 


63 


2343 


gi506431 


Homo sapiens 


lysosomal acid lipase 


1288 


63 



2344 


gi20152322 


Homo sapiens 


putative G-protein coupled 
receptor 


1570 


100 


2344 


gi32526601 


Homo sapiens 


GPRC5D 


1576 


100 


2344 


gi8118040


Homo sapiens 


AF209923_1 orphan G-protein 
coupled receptor 


1570 


100 


2345 


gi17224598


Homo sapiens 


AF293615_1 blood dendritic
cell antigen 2 protein 


1147 


95 


2345 


gi17225337


Homo sapiens 


AF325459_1 dendritic lectin


1147 


95 


2345 


gi17225339


Homo sapiens 


AF325460_1 dendritic lectin b 
isoform 


953 


82 


2346 


gi17224598


Homo sapiens 



AF293615_1 blood dendritic
cell antigen 2 protein 


1147 


95 


2346 


gi17225337


Homo sapiens 


AF325459_1 dendritic lectin


1147 


95 


2346 


gi17225339


Homo sapiens 


AF325460_1 dendritic lectin b
isoform 


953 


82 


2347 


gi2l929H9 


Homo sapiens 


seven transmembrane helix 
receptor


1588 


100 


2347 


gi2792016


Homo sapiens 


olfactory receptor 


1393 


100 


2347 


gi4092819


Homo sapiens 


BC319430_5


1386 


100 


2348 


gi2589172


Rattus norvegicus 


mucin Muc3 


308 


36 


2348 


gi28436742 


Mus musculus 


Muc3 protein 


295 


37 


2348 


gi5911169


Homo sapiens 


AF147790_1 transmembrane 
mucin 12 


719 


81 


2349 


gi3549152


Homo sapiens 


R29124_1


180 


36 


2349 


gi8101840


Papio hamadryas 


AF259559_1
carcinoembryonic antigen-
family cell adhesion molecule
w; CEACAMw


182


35






2349 


gi8101856 


Cercopithecus 
aethiops 


AF259567_1

carcinoembryonic antigen- 
family cell adhesion molecule 
1-1; CEACAM1 


179 


33 


2350 


gi27924102 


Mus musculus 


2310075M15Rik protein


944 


68 


2350 


gi29436830 


Mus musculus 


2310075M15Rik protein


944 


68 


2350 


gi6273399 


Homo sapiens 


AF200348_1 melanoma- 
associated antigen MG50 


940 


67 


2351 


gi27924102 


Mus musculus 


2310075M15Rik protein 


944 


68 


2351 


gi29436830 


Mus musculus 


2310075M15Rik protein 


944 


68 


2351 


gi6273399 


Homo sapiens 


AF200348_1 melanoma- 
associated antigen MG50 


940 


67


2352 


gi10435776


Homo sapiens 


unnamed protein product 


1132 


99 


2352 


gi32451585 


Homo sapiens 




681 


60 


2352 


gi7264653 


Mus musculus 


AF180470_1 Kiaa0575


694 


62 


2353 


gi20219008 


Chlamydomonas 
reinhardtii 


AF394181_1 coiled-coil
flagellar protein 


280 


29 


2353 


gi23497711 


Plasmodium 
falciparum 3D7 


AE014826_49 rhoptry protein, 
putative 


149 


25 


2353 


gi5457791 


Pyrococcus abyssi 


smcl chromosome segregation 
protein 


150 


22 


2354 


gi12654511


Homo sapiens 


Torsin family 3, member A 


1438 


100 


2354 


gi14043167


Homo sapiens 


Torsin family 3, member A 


1438 


100 


2354 


gi15079904


Homo sapiens 


Torsin family 3, member A 


1438 


100 


2356 


gi15076843


Homo sapiens 


AF233450_1 pecanex-like 
protein 1 


948 


72 


2356 


gi18157547


Mus musculus 


AF237953_1 pecanex-like 3 


1325 


98 


2356 


gi6650377 


Mus musculus 


AF096286_1 pecanex 1 


948 


71 


2357 


gi15076843


Homo sapiens 


AF233450_1 pecanex-like 
protein 1 


948 


72 


2357 


gi18157547


Mus musculus 


AF237953_1 pecanex-like 3


1325 


98 


2357 


gi6650377 


Mus musculus 


AF096286_1 pecanex 1 


948 


71 


2358 


gi1872200


Homo sapiens 


alternatively spliced product 
using exon 13A 


298 


72 


2358 


gi2580578 


Homo sapiens 


ubiquitous TPR motif; Y 
isoform 


301 


70 


2358 


gi8572229 


Homo sapiens 


ubiquitous TPR-motif protein 
Y isoform 


301 


70 


2359 


gi12043567


Homo sapiens 


unc-93 related protein 


1544 


97 


2359 


gi17390915


Mus musculus 


unc93 homolog B 


1350 


85 


2359 


gi23271746 


Mus musculus 


Unc93b protein 


1350 


85 


2360 


gi15990461


Homo sapiens 


AAH15612 ring finger protein 
25 


2465 


100 


2360 


gi18490513


Mus musculus 


Rnf25 protein 


1983 


82 


2360 


gi29179411 


Mus musculus 


Ring finger protein 25 


1988 


82 


2361 


gi14714684


Mus musculus 


2810423E13Rik protein 


632 


83 


2361 


gi33086578 


Rattus norvegicus 


Ab2-276 


385 


82 


2361 


gi7295255 


Drosophila 
melanogaster 


CG8596-PA 


307 


46 


2362 


gi16930383


Pan troglodytes 


AF383169_1 leukocyte 
immunoglobulin-like receptor e 


172 


38 


2362 


gi32396010 


Bos taurus 


immunoglobulin A Fc receptor 


179 


33 
2362


gi6563042 


Homo sapiens 


AF109683_1 leukocyte- 
associated Ig-like receptor 1b


179 


24 


2363 


gi16930383


Pan troglodytes 


AF383169_1 leukocyte 
immunoglobulin-like receptor e 


172 


38 


2363 


gi32396010 


Bos taurus 


immunoglobulin A Fc receptor 


179 


33 


2363 


gi6563042 


Homo sapiens 


AF109683_1 leukocyte- 
associated Ig-like receptor 1b


179 


24 


2364 


gi21595190 


Mus musculus 


2510001A17Rik protein


366 


98 


2364 


gi21707128 


Homo sapiens 


Ran binding protein 11


370 


100 


2364 


gi6650612 


Homo sapiens 


AF111109_1 Ran binding
protein 11


370 


100 


2367 


gi11493419


Homo sapiens 


AF130117_15 PRO1367


128 


51 


2367 


gi6690223 


Homo sapiens 


AF090928_1 PRO0470


118 


50 


2367 


gi6855613 


Homo sapiens 


AF113685_1 PRO0974


154 


51 


2369 


gi3002527 


Homo sapiens 


neuronal thread protein AD7c- 
NTP 


404 


48 


2369 


gi32486167 


Homo sapiens 


AD7C-NTP 


404 


48 


2369 


gi6650810 


Homo sapiens 


AF118094_21 PRO1902 


258 


64 


2370 


gi13278391


Mus musculus 


RIKEN cDNA 9430015G10


595 


71 


2370 


gi14250646


Homo sapiens 


FLJ20584 protein 


803 


98 


2370 


gi7020791 


Homo sapiens 


unnamed protein product 


834 


99 


2371 


gi16588454


Homo sapiens 


AF312374_1 AGTRAP protein


823 


100 


2371 


gi16878260


Homo sapiens 


AAH17328 Similar to 
angiotensin II, type I receptor- 
associated protein 


776 


95 


2371 


gi9621816 


Homo sapiens 


AF165187_1 ATRAP


822 


99 


2372 


gi12330704


Mus musculus 


AF333770_1 cell recognition 
molecule CASPR4 


539 


82 


2372 


gi17986216


Homo sapiens 


AF333769_1 cell recognition 
molecule CASPR3 


633 


97 


2372 


gi21961652 


Mus musculus 


contactin associated protein 4 


539 


82 


2373 


gi12330704


Mus musculus 


AF333770_1 cell recognition 
molecule CASPR4 


539 


82 


2373 


gi17986216


Homo sapiens 


AF333769_1 cell recognition 
molecule CASPR3 


633 


97 


2373 


gi21961652 


Mus musculus 


contactin associated protein 4 


539 


82 


2374 


gi11041469


Macaca fascicularis 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase 


1116 


63 


2374 


gi21552746 


Homo sapiens 


AF410457_1 putative 
polypeptide N- 

acetylgalactosaminyltransferase 


1670 


100 


2374 


gi21552969 


Mus musculus 


AF467979_1 Williams-Beuren 
syndrome critical region gene 
17 


1656 


98 


2375 


gi16198335


Drosophila 
melanogaster 


SD08329p 


411 


47 


2375 


gi23092707 


Drosophila 
melanogaster 


CG17090-PA 


411 


47 


2375 


gi23092708 


Drosophila 
melanogaster 


CG17090-PB 


411 


47 


2377 


gi14571502


Homo sapiens 


calcium-promoted Ras 
inactivator 


1022 


81 


2377 


gi15680152


Homo sapiens 


AAH14420 


317 


41 


2377 


gi4185294


Homo sapiens 


rasGAP-activating-like protein 


289 


36 


2379 


gi15128105


Mus musculus 


AF397008_1 nephronectin


737 


82 

2379


gil jUUZ^K) 


Mus musculus 


nephronectin short isoform


OIO 


CO 

o2 


2379


gl I j43UZ4o 


Mus musculus 


nephronectin long isoform


on 

/3 / 


QO 

o2 


oion 
23 oU 


gU0U4l0/D 


Homo sapiens 


AAili 0 /w* joined io j/vz«r 1 


0111 


oo 
yy 


oicn 
23 oU 


gll /cozyD4 


Drosophila 
melanogaster 




ill 


1 1 
31 


oicn 

23 BU 


gizoooy / ij 


Homo sapiens 


Similar to joined to JAZF1


1£1 
303 


91 


2381


rrtOGIQOI^ 


Xenopus laevis 




O/xl 
203 


oc 
25 


2381 


gi3242649 


Rana catesbeiana 


alpha 1 type I collagen 


297 


28 



2381 


gl4i4UUZy 


Cynops pyrrhogaster 


alpha 1 type I collagen 


no 
2/ / 


2/ 


2382


gi32967231


Homo sapiens


TAFA3


481


100


2382


gi32967237


Homo sapiens


TAFA3.2


619


100


2382 


gi32967243 


Mus musculus 


TAFA3 


390 


82 


2383 


gi32967231 


Homo sapiens 


TAFA3 


481 


100 


2383 


gi32967237 


Homo sapiens 


TAFA3.2 


619 


100 


2383 


gi32967243 


Mus musculus 


TAFA3 


390 


82 


2384 


gi10443967


Homo sapiens 


AF268610_1 THEG protein


298 


60 


2384 


gi20306274 


Homo sapiens 


testicular haploid expressed 
gene 


298 


60 


2384 


gi7416134 


Homo sapiens 


testis-specific gene 


298 


60 


2385 


gi18480746


Mus musculus 


olfactory receptor MOR261-10 


1336 


80 


2385 


gi21928655


Homo sapiens 


seven transmembrane helix 
receptor 


1427 


90 


2385 


gi32052225 


Mus musculus 


olfactory receptor
GA_x6K02T2P3E9-4341246-4340281


1336 


80 


2386 


gi18480746


Mus musculus 


olfactory receptor MOR261-10 


1336 


80 


2386 


gi21928655 


Homo sapiens 


seven transmembrane helix 
receptor 


1427 


90 


2386 


gi32052225 


Mus musculus 


olfactory receptor
GA_x6K02T2P3E9-4341246-4340281


1336 


80 


2387 


gi13937888


Homo sapiens 


AAH07052 Similar to 
heterogeneous nuclear 
ribonucleoprotein C 


196 


97 


2387 


gi337455 


Homo sapiens 


hnRNP C2 protein 


196 


97 


2387 


gi4139188 


Mus musculus 


heterogeneous nuclear 
ribonucleoprotein C1/C2; 
hnRNP C1/C2


190 


95


2388 


gi190259


Homo sapiens 


neuron-specific protein 


335 


100 


2388 


gi190261


Homo sapiens 


21 kDa protein 


335 


100 


2388 


gi56877 


Rattus norvegicus 


reading frame 1 


331 


98 


2389 


gi14573319


Homo sapiens 


AF334755_1 interleukin-1
HY2 


818 


100 


2389 


gi14573321


Homo sapiens 


AF334756_1 interleukin-1
HY2 


818 


100


2389 


gi18025344


Homo sapiens 


interleukin-1 receptor 
antagonist-like FIL1 theta 


804 


98 


2390 


gi27694303 


Homo sapiens 


Similar to keratin, hair, acidic, 
6 


694 


69 


2390 


gi3724099 


Homo sapiens 


type I hair keratin 1 


692 


69 


2390 


gi3724114 


Homo sapiens 


type I hair keratin 6 


694 


69 


2391 


gi32488718 


Oryza sativa 
(japonica cultivar- 
group) 


OSJNBa0088H09.19 


121 


41 
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2393 


gi14595019


Homo sapiens 


keratin 6 irs 


362 


98 


2393 


gi27901522


Homo sapiens 


keratin 6 irs3 


361 


98 


2393 


gi27901524


Homo sapiens 


keratin 6 irs4 


353 


95 


2394 


gi11066090


Homo sapiens 


AF195192_1 matrix 
metalloprotease MMP-27 


507 


100 


2394 


gi12006364


Tupaia belangeri 


AF281673_1 matrix 
metalloproteinase-27 


458 


91 


2394 


gi3511149


Gallus gallus 


matrix metalloproteinase 


353 


60 


2395 


gi11066090


Homo sapiens 


AF195192_1 matrix 
metalloprotease MMP-27 


507 


100 


2395


gi12006364


Tupaia belangeri 


AF281673_1 matrix 
metalloproteinase-27 


458 


91 


2395 


gi3511149 


Gallus gallus 


matrix metalloproteinase 


353 


60 


2396 


gi24710913


Homo sapiens 


suppressor of fused 


2599 


100 


2396 


gi5739507 


Homo sapiens 


AF175770_1 suppressor of 
fused 


2594 


99 


2396 


gi6689894 


Homo sapiens 


AF159447_1 Suppressor of 
Fused 


2599 


100 


2397 


gi20387087 


Oncorhynchus 
mykiss 


like-2 


155 


32 


2397 


gi21667212


Homo sapiens 


AF465766_1 
bactericidal/permeability- 
increasing protein-like 2 


535 


100 


2397 


gi28173296


Cyprinus carpio


bactericidal permeability-increasing
protein/lipopolysaccharide-binding protein


161 


36 


2398 


gi19526647


Homo sapiens 


AF462348_1 oxidored-nitro
domain-containing protein 


2019 


99 


2398 


gi28175624


Mus musculus 


RIKEN cDNA 1810007P19 
gene 


1704 


86 


2398 


gi7303522 


Drosophila 
melanogaster 


CG13178-PA 


214 


29 


2399 


gi19526647


Homo sapiens 


AF462348_1 oxidored-nitro 
domain-containing protein 


2019 


99 


2399 


gi28175624


Mus musculus 


RIKEN cDNA 1810007P19 
gene 


1704 


86 


2399 


gi7303522


Drosophila 
melanogaster 


CG13178-PA 


214 


29 


2400


gi2072977


Homo sapiens


putative p150


151


100


2400


gi339771


Homo sapiens


ORF2


151


100


2400


gi339777


Homo sapiens


ORF2 contains a reverse
transcriptase domain.


151


100


2402 


gi11493483


Homo sapiens 


AF130117_48 PRO2550


303 


64 


2402 


gi7020440 


Homo sapiens 


unnamed protein product 


310 


57 


2402


gl/ / /Ul jy 


Homo sapiens 


aci loon 11 do r\t too 
Arliyyi/ U rKUl ill 


loy 


oil 


2404 


gi1403325


Homo sapiens 


MACH-beta-1 


122 


92 


2404 


gi1403327


Homo sapiens 


MACH-beta-2 


122 


92 


2405 


gi1799570


Rattus norvegicus 


TIP120


6200 


99 


2405 


gi29792160 


Homo sapiens 


TIP120 protein


6213


99 


2405 


gi7688703 


Homo sapiens 


AF157326_1 TIP120 protein


6200 


99 


2406 


gi13016701


Homo sapiens 


activating coreceptor NKp80 


1209 


97 


2406 


gi22449867 


Macaca fascicularis 


NKp80NK receptor 


1105 


87 


2406 


gi7188567


Homo sapiens 


AF175206_1 lectin-like
receptor F1


1209 


97 
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o A no 

2408 


gizioiyiyo 


Homo sapiens 


-like 1X-linked


233 


80


2408 


gl2 /o9.>4U7 


Mus musculus 


Tbl1x protein


233 


80 


2408 


gi30353941 


Homo sapiens 


TBL1X protein


233


80 


2409 


gi12804613


Homo sapiens


AAH01728


670 


82 


2409 


gi13279113


Homo sapiens


AAH04281


670 


82 


2409 


gi14043598


Homo sapiens 


AAH07776 


670 


82 


2410 


gi12804613


Homo sapiens


AAH01728


670


82


2410 


gi13279113


Homo sapiens 


AAH04281 


670 


82 


2410 


gi14043598


Homo sapiens 


AAH07776 


670 


82 


2411 


gi12804613


Homo sapiens 


AAH01728 


670 


82 


2411 


gi13279113


Homo sapiens 


AAH04281 


670 


82 


2411 


gi14043598


Homo sapiens 


AAH07776 


670 


82 


2412 


gi13182755


Homo sapiens 


AF212237_1 HPHRP 


1816 


99 


2412 


gi15929309


Homo sapiens 


Phosphotriesterase related 


1824 


100 


2412 


gi29791939 


Homo sapiens 


phosphotriesterase related 


1824 


100 


2414 


gi22539701 


Mus musculus 


4930506M07Rik protein 


2153 


93 


2414 


gi4778 


Saccharomyces 
cerevisiae 


Uso1 protein


215 


23 


2414 


gi677198 


Saccharomyces 
cerevisiae 


putative 


217 


23 


2415 


gi27899969 


Homo sapiens 


unnamed protein product 


208 


66 


2415 


gi27900262 


Homo sapiens 


unnamed protein product 


208 


66 


2415 


gi6690248 


Homo sapiens 


AF090942_1 PRO0657


192 


57


2419 


gi13377880


Cricetulus 
longicaudatus 


AF336043_1 arginine
N-methyltransferase p82 isoform


2585 


85 


2419 


gi13377882


Cricetulus 
longicaudatus 


AF336044_1 arginine
N-methyltransferase p77 isoform


2534 


86 


2419 


gi13879453


Mus musculus 


cDNA sequence BC006705 


2565 


87 


2420 


gi16306618


Homo sapiens 


AAH01482 phosphatidylserine 
decarboxylase 


1645


99 


2420 


gi191185


Cricetulus griseus 


phosphatidylserine 
decarboxylase 


1544 


93 


2420 


gi27371042


Xenopus laevis 


Similar to phosphatidylserine 
decarboxylase 


958 


57 


2421 


gi30041 


Homo sapiens 


COL2A1 


122 


28 


2421 


gi450394 


Homo sapiens 


alpha-1 type II collagen


122 


28 


2421 


gi930050 


Homo sapiens 




122 


28 


2422 


gi13874437


Homo sapiens


cerebral protein-11


159 


75 


2422 


gi20987344 


Mus musculus 


LOC212904 protein


618 


69 


2422 


gi24980850 


Homo sapiens 




765 


100 


2423 


gi13543940


Homo sapiens 


Hypothetical protein 
DKFZp434B195 


2094 


99 


2423 


gi14035978


Homo sapiens 


unnamed protein product 


2080 


98 


2423 


gi16923351


Homo sapiens


AF204270_1 RbBP-35


1419 


98 


2424 


gi18676660


Homo sapiens 


FLJ00229 protein 


665 


99 


2424 


gi25955706 


Homo sapiens 


Similar to hypothetical protein 
MGC38041 


665 


99 


2424 


gi32484169 


Homo sapiens 




665 


99 


2425 


gi27549552 


Homo sapiens 


dipeptidyl peptidase IV-related
protein-2 


410 


89 


2425 


gi29293087 


Homo sapiens 


dipeptidyl peptidase 9 


410 


89 


2425 


gi3513303


Homo sapiens


R26984_1


476 


100 


2426 


gi27549552 


Homo sapiens 


dipeptidyl peptidase IV-related 
protein-2 


410 


89 
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9496 

2426


gi29293087


Homo sapiens


dipeptidyl peptidase 9


410


89


2426


gi3513303


Homo sapiens


R26984_1


476


100


2427


gi27549552


Homo sapiens


dipeptidyl peptidase IV-related
protein-2


410


89


2427


gi29293087


Homo sapiens


dipeptidyl peptidase 9


410


89


2427


gi3513303


Homo sapiens


R26984_1


476


100


2428 


gi13097642


Homo sapiens 


Ribosomal protein S25 


169 


100 


2428


• i 0 0*701 AQ 

giiiz/yi4y 


Homo sapiens 


Ribosomal protein S25


169


100 


2428


' 1 1/l'l/C/lOO 

gll34Jo4zz 


Homo sapiens 


Ribosomal protein S25


169


100


2429 


gi21756739 


Homo sapiens 


unnamed protein product 


2539 


96 


2429


giz3z70ozz 


Homo sapiens 




2427 


96 


2429


glo4o353o 


Homo sapiens 


hypothetical protein


2061 


99 


2430 


gi12652695


Homo sapiens 


AAH00096 HtrA-like serine 
protease 


1611 


92 


2430 


gi5870865


Homo sapiens 


serine protease 


1611 


92 


2430 


gi7672669 


Homo sapiens 


AF141305_1 serine protease
Htra2 


1611 


92 


2431 


gi24078514 


Mus musculus 


AF454954_1 crossveinless-2


561 


95 


2431 


gi32816043 


Mus musculus 


BMP-binding endothelial 
regulator precursor protein 


561 


95 


2431 


gi32892146 


Homo sapiens 


crossveinless-2 


595 


100 


2432 


gi16502169


Salmonella enterica 
subsp. enterica 
serovar Typhi 


putative DNA methylase 


756 


85 


2432


gi29137981


Salmonella enterica 
subsp. enterica 
serovar Typhi Ty2 


putative DNA methylase 


756 


85 


2432


gl4yo/oo 


Serratia marcescens 


Deoxyadenosyl- 
methyl transferase 


337


47 


2433


' i CCY1A OC 1 


Gallus gallus


CALI1 


184 


44 


2433


~ 1 AAAQO/liC 

gll99Uo34o 


Gallus gallus 


chondrogenesis associated 
lipocalin 


137 


37 


2433


«»IOO AAA/^O O 

glZzUy0o3o 


Gallus gallus 


lipocalin-type prostaglandin D 
synthase 


137 


37 


2434


l *T 1 00*701 

gii / i3z/yi 


Nostoc sp. PCC 7120


asparaginyl-tRNA synthetase 


766


44 


2434


gizzzyozOO 


Thermosynechococcus
elongatus BP-1


asparaginyl-tRNA synthetase 


767 


41 


2434


giJUZDyzoo 


Bacillus anthracis str. 
Ames 


asparaginyl-tRNA synthetase


774


43 


2435


gllZODjUOl 


Homo sapiens 


A A UA1 10A 


532 


oo 
00 


2435


m 90 5747 QQ 
glZoD/4/00 


Macaca fascicularis 


succinate dehydrogenase 
flavoprotein subunit 


539


89


2435


tri575Q170 
glJ / jyi / J 


Homo sapiens


succinate dehydrogenase
flavoprotein subunit


coo 
03z 


oo 
oo 


2436 


gi21928188 


Mus musculus 


GPI-gamma4; GPIgamma4


853 


67 


2436 


gi29747988 


Mus musculus 


GPI-gamma 4 


853 


67 


2436 


gi30931171 


Mus musculus 


GPIgamma4 protein 


853 


67 


2437 


gi15082311


Homo sapiens 


AAH12061 -binding protein 3 


631 


98 


2437 


gi27503479 


Mus musculus 


Pcbp3 protein 


631 


98 


2437 


gi9957165 


Homo sapiens 


AF176329_1 alphaCP-3


631 


98 


2438 


gi16553246


Homo sapiens 


unnamed protein product 


254 


98 


2438 


gi21739662 


Homo sapiens 


hypothetical protein 


218 


88 


2438 


gi21752375 


Homo sapiens 


unnamed protein product 


218 


88 


2439 


gi12804943


Homo sapiens 


AAH01924 beta 


1660 


90 



WO 2004/080148 



PCT/US2003/030720 



282 
TABLE 2B



SEQ_ID


Hit_ID


Species


Description


S_score


Percentage
Identity


2439 


gi189762


Homo sapiens 


pyruvate dehydrogenase E1-beta subunit


1663 


91 


2439 


gi190792


Homo sapiens 


pyruvate dehydrogenase E1-beta subunit precursor


1663 


91 


2440 


gi164851


Oryctolagus 
cuniculus 


calsequestrin precursor 


1903


92 


2440 


gi2618621


Mus musculus 


skeletal muscle calsequestrin 


1921 


93 


2440 


gi688292 


Homo sapiens 


calmitine; calsequestrine 


2012 


99 


2441 


gi1177622


Saccharomyces 
cerevisiae 


AOF1001 


177 


30 


2441 


gi13592175


Leishmania major 


AC084329_1 ppg3 


193 


26 


2441 


gi28828184 


Dictyostelium 
discoideum 


similar to Leishmania major. 
Ppg3 


192 


26 


2442 


gi20380863 


Homo sapiens 


Similar to T cell receptor beta 
locus 


1364 


84 


2442 


gi307487 


Homo sapiens 


T-cell receptor beta 


1498 


93 


2442 


gi8515902


Homo sapiens


T cell receptor beta chain


1300 


84 


2444 


gi14599484


Homo sapiens 


AF333952_1 small proline-rich 
protein 2B 


453 


98 


2444 


gi3367693 


Homo sapiens 


small proline-rich protein 


458 


100 


2444 


gi385227 


Homo sapiens 


small proline-rich protein 2 


453 


98 


2445 


gi13876336


Mus musculus 


protocadherin gamma A5 


4081 


84 


2445 


gi5456942 


Homo sapiens 


protocadherin gamma A5 


4744 


99 


2445 


gi5457072 


Homo sapiens 


AF152512_1 protocadherin 
gamma A5 short form protein 


4109 


100 


2447 


gi200962 


Mus musculus 


serine 1 ultra high sulfur 
protein 


262 


45 


2447 


gi200964 


Mus musculus 


serine 2 ultra high sulfur 
protein 


296 


49 


2447 


gi3228237 


Homo sapiens 


ultra high sulfer keratin 


2oi 


A Q 


2448 


gi14764499


Homo sapiens


zinc finger protein


QAQ 


00 


2448 


gi1504006


Homo sapiens


similar to human ZFY protein.


442


JO 


2448 


gi28204954 


Mus musculus 


Similar to zinc finger protein 


771 


70


2450 


gi17223709


Homo sapiens 


selenoprotein SelM 


235 


100 


2450 


gi17223711


Mus musculus 


selenoprotein SelM 


188 


78 


2450 


gi26351995 


Mus musculus 


unnamed protein product 


162 


76 


2451 


gi28848644 


Homo sapiens 


p02 protein 


181 


100 


2451 


gi30354510 


Homo sapiens 


TPT1 protein 


181 


100 


2451 


gi33285832 


Homo sapiens 


TCTP 


181 


100 


2452 


gi13937829


Homo sapiens 


AAH07016 


946 


100 


2452 


gi18606299


Homo sapiens 




946 


100


2452 


gi3360432 


Homo sapiens 


osteopontin 


946 


100 


2453 


gi14326586


Homo sapiens 


AF386078_1 serine-cysteine 
proteinase inhibitor clade C 
member 1 


360 


92 


2453 


gi179130


Homo sapiens 


antithrombin III 


360 


92 


2453 


gi18490839


Homo sapiens 


, member 1 


360 


92 


2454 


gi37231 


Homo sapiens 


DNA topoisomerase II 


8439 


99 


2454 


gi3869382 


Homo sapiens 


DNA topoisomerase II beta 


8299 


99 


2454 


gi790988 


Cricetulus 
longicaudatus 




8167 


96 


2455 


gi1881713


Rattus norvegicus 


fatty acid transport protein 


222 


84 


2455 


gi20810561


Mus musculus 


, member 1 


219 


82 


2455 


gi563829 


Mus musculus 


fatty acid transport protein 


219 


82 
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2456 


gi13277626


Mus musculus 


homolog, subunit 7a 


247 


57 


2456 


gi15215085


Mus musculus 


Cops7b protein 


428


98 


2456 


gi3309176 


Mus musculus 


COP9 complex subunit 7b 


428 


98 


2457 


gi180251


Homo sapiens 


precerebellin 


183 


48 


2457 


gi6942096 


Mus musculus 


CBLN3 


472 


90 


2457 


gi6942098 


Mus musculus 


AF218380_1 CBLN3


472 


90 


2458 


gi17861952


Drosophila 
melanogaster 


LD01947p 


196 


55 


2458 


gi31432182 


Oryza sativa 
(japonica cultivar-group)


putative R1M2 protein 


158 


42 


2458 


gi7291183 


Drosophila 
melanogaster 


CG1826-PA 


196 


55 


2459 


gi20387087 


Oncorhynchus 
mykiss 


like-2 


155 


32 


2459 


gi21667212


Homo sapiens


AF465766_1
bactericidal/permeability-increasing protein-like 2


535 


100


2459 


gi28173296 


Cyprinus carpio 


bactericidal permeability-increasing
protein/lipopolysaccharide-binding protein


161 


36 


2460 


gi20387087 


Oncorhynchus 
mykiss 


like-2 


155 


32 


2460 


gi21667212 


Homo sapiens 


AF465766_1
bactericidal/permeability-increasing protein-like 2


535 


100 


2460 


gi28173296


Cyprinus carpio


bactericidal permeability-increasing
protein/lipopolysaccharide-binding protein


161 


36 


2461 


gi20387087 


Oncorhynchus 
mykiss 


like-2 


155 


32 


2461 


gi21667212 


Homo sapiens 


AF465766_1
bactericidal/permeability-increasing protein-like 2


535 


100 


2461 


gi28173296


Cyprinus carpio


bactericidal permeability-increasing
protein/lipopolysaccharide-binding protein


161 


36 


2462 


gi10435038


Homo sapiens 


unnamed protein product 


1718 


96 


2462 


gi18257341


Mus musculus 


Expressed sequence 
AW060207 


1044 


63 


2462 


gi24659229 


Homo sapiens 


hypothetical protein FLJ13150


1727 


97 


2464 


gi27469556 


Homo sapiens 


Putative neuronal cell adhesion 
molecule 


180 


94 


2464 


gi4206390 


Homo sapiens 


putative neuronal cell adhesion 
molecule 


180 


94 


2465 


gi12667401


Homo sapiens


AF326731_1 NUF2R


2336 


99 


2465 


gi14317902


Homo sapiens 


kinetochore protein Nuf2 


2336 


99 


2465 


gi18043223


Mus musculus 


NUF2R protein 


1744 


72 


2466 


gi23321257 


Homo sapiens 


ezrin-binding partner PACE-1 


3482 


97 


2466 


gi24209887 


Homo sapiens 


ezrin-binding protein PACE-1 


3381 


90 


2466 


gi29144929 


Mus musculus 


Ezrin-binding partner PACE-1 


2738 


75 


2467 


gi21634823 


Homo sapiens 


AF389428_1 semaphorin6D


1487 


97 
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isoform 3 






2467 


gi21634825


Homo sapiens


AF389429_1 semaphorin6D
isoform 4 


1487 


97 


2467 


gi21634827


Homo sapiens 


AF389430_1 semaphorin6D 
isoform 1 


1487 


97 


2468 


gi13543141


Mus musculus 


Slc37a3 protein 


141 


52 


2469 


gi21671105 


Homo sapiens 


RAD52B 


511 


100 


2469 


gi23468352 


Homo sapiens 


Similar to RAD52B 


511 


100 


2469 


gi32967621 


Mus musculus 


2410008M22Rik protein


311 


66 


2470 


gi28626251 


Homo sapiens 


calcium-permeable store- 
operated channel TRPM3c 


289 


91 


2470 


gi28626253 


Homo sapiens 


calcium-permeable store- 
operated channel TRPM3d 


289 


91 


2470 


gi28626255 


Homo sapiens 


calcium-permeable store-operated channel TRPM3e


289 


91 


2472 


gi20987880 


Mus musculus 


E130103I17Rik protein 


1605 


71


2472 


gi28204917 


Mus musculus 


E130103I17Rik protein 


1594 


71 


2472 


gi4588087 


Homo sapiens 


AF095771_1 PTH-responsive 
osteosarcoma B1 protein


1864 


89 


2473 


gi13591434


Homo sapiens 




413 


74 


2473 


gi13591435


Homo sapiens 




416 


87 


2473 


gi19913471


Homo sapiens 




413 


74 


2474


gi28372402 


Homo sapiens 


truncated transmembrane 
transport protein 


1271 


100 


2474 


gi31324239


Homo sapiens 


proton-coupled amino acid 
transporter 


1263 


100 


2474 


gi31871291 


Homo sapiens 


proton/amino acid transporter 1 


1263 


100 


2475 


gi28372402 


Homo sapiens 


truncated transmembrane 
transport protein 


1271 


100 


2475 


gi31324239


Homo sapiens 


proton-coupled amino acid 
transporter 


1263 


100 


2475 


gi31871291 


Homo sapiens 


proton/amino acid transporter 1


1263 


100 


2476 


gi11138040


Homo sapiens 


rat myomegalin mRNA is 
reported in Acc# 
AF139185~similar to rat 
myomegalin 


828 


97 


2476 


gi11138042


Homo sapiens


rat myomegalin mRNA is
reported in Acc#
AF139185~similar to rat
myomegalin


1091 


93 


2476 


gi19263586


Homo sapiens 


similar to rat myomegalin 


1085 


93 


2477


gi19263005


Ciona intestinalis 


leucine-rich repeat dynein light 
chain 


367 


66 


2477 


gi2760161 


Anthocidaris 
crassispina 


outer arm dynein light chain 2 


338 


63 


2477 


gi7303901 


Drosophila 
melanogaster 


CG8800-PA 


265 


51 


2478 


gi12666531


Homo sapiens


putative b,b-carotene-9',10'-dioxygenase


917 


99 


2478 


gi14582265


Homo sapiens 


AF276432_1 putative carotene 
dioxygenase 


930 


100 


2478 


gi27370671 


Homo sapiens 


Similar to beta-carotene 
dioxygenase 2 


930 


100 


2479 


gi12666531


Homo sapiens


putative b,b-carotene-9',10'-dioxygenase


917 


99 
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Identity 


2479 


gi14582265


Homo sapiens


AF276432_1 putative carotene
dioxygenase


930


100


2479 


gi27370671 


Homo sapiens 


Similar to beta-carotene
dioxygenase 2


930


100


2480 


gi1079734


Mus musculus 


citron 


718 


97 


2480 


gi30088970 


Homo sapiens 


rho/rac-interacting citron
kinase


696


99


2480 


gi3599509 


Mus musculus 


rho/rac-interacting citron
kinase


689


97


2481 


gi24980821 


Homo sapiens 


box polypeptide 26 


258 


100 


2481 


gi32485107 


Homo sapiens 


nexin-related serine protease
inhibitor


731


94


2481 


gi6062874 


Homo sapiens 


candidate tumor suppressor
protein DICE1


258


100


2482 


gi13383364


Homo sapiens 


claudin-1 


1095 


99 


2482 


gi15214678


Homo sapiens


AAH12471 claudin 1


1095 


99 


2482 


gi7381083 


Homo sapiens 


AF134160_1 claudin-1


1095 


99 


2483 


gi22902436 


Mus musculus 


Sphingosine-1-phosphate
phosphatase 1


616


40


2483 


gi23345324 


Homo sapiens 


sphingosine 1-phosphate
phosphohydrolase 2


1513


99


2483 


gi29436890 


Mus musculus 


Similar to sphingosine-1-phosphate phosphotase 2


1406


90


2484 


gi2072977 


Homo sapiens 


putative p150


137 


79 


2484 


gi339771 


Homo sapiens 


ORF2 


137 


79 


2484 


gi339777 


Homo sapiens 


ORF2 contains a reverse
transcriptase domain.


137


79


2485 


gi2072977 


Homo sapiens 


putative p150


137


79


2485 


gi339771 


Homo sapiens 


ORF2 


137 


79 


2485 


gi339777 


Homo sapiens 


ORF2 contains a reverse
transcriptase domain.


137


79


2487 


gi18033185


Danio rerio


AF330001_1 UNC45-related
protein


1491


79




2487


gi27436424


Mus musculus


striated muscle UNC45 


1757 


95


2487 


gi27436426 


Homo sapiens 


striated muscle UNC45 


1800 


98 


2488 


gi26801168


Gallus gallus 


condensin complex subunit 


1330 


44 


2488 


gi3851586 


Homo sapiens 


chromosome-associated
protein-C


1123


63


2488 


gi4092846 


Homo sapiens 


chromosome-associated
polypeptide-C


1123


63


2489 


gi2407911 


Homo sapiens 


C016 


1252 


99 


2489 


gi29437323 


Mus musculus 


Similar to cDNA for
differentially expressed CO16
gene


226


40


2489 


gi6013073


Mus musculus 


HemT-3 protein 


141 


27 


2490 


gi13157560


Homo sapiens 




2246 


99 


2490 


gi18147612


Homo sapiens 


metalloprotease disintegrin 


2246 


99 


2490 


gi21908030 


Homo sapiens 


a disintegrin and
metalloprotease domain 33


2230


98


2491 


gi15145793


Sus scrofa 


basic proline-rich protein 


186 


34 


2491 


gi3858883 


Acanthamoeba
castellanii


myosin I heavy chain kinase


218


37


2491


gi4206769


Acanthamoeba
castellanii


myosin I heavy chain kinase


218


37


2492 


gi1136434


Homo sapiens 


KIAA0187 


198 


72 
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2492 


gi21410151 


Mus musculus 


LOC213895 protein 


173 


62 


2492 


gi27696627 


Homo sapiens 


Ribosome biogenesis protein
BMS1 homolog


198 


72 


2493 


gi13559063


Homo sapiens 




747 


100 


2493 


gi24416538 


Mus musculus 


1700001D09Rik protein 


631 


72 


2493 


gi9963863 


Homo sapiens 


AF226731_1 AD026


688 


99 


2495 


gi156258


Caenorhabditis 
elegans 


collagen 


139 


33 


2495 


gi21105301


Mytilus
galloprovincialis


AF448525_1 precollagen-P


152 


28 


2495 


gi2388676 


Mytilus edulis 


precollagen P


148 


29 


2496 


gi156258


Caenorhabditis 
elegans 


collagen 


139 


33 


2496 


gi21105301


Mytilus
galloprovincialis


AF448525_1 precollagen-P 


152 


28 


2496 


gi2388676 


Mytilus edulis 


precollagen P


148 


29 


2497 


gi156258


Caenorhabditis 
elegans 


collagen 


139 


33 


2497 


gi21105301


Mytilus
galloprovincialis


AF448525_1 precollagen-P 


152 


28 


2497 


gi2388676 


Mytilus edulis 


precollagen P


148 


29 


2498 


gi20380052 


Homo sapiens 




372 


32 


2498 


gi20380522


Mus musculus


Col3a1 protein


368 


31 


2498 


gi29144943


Mus musculus


Col3a1 protein


368 


31 


2499 


gi14035874


Homo sapiens 


unnamed protein product 


1100 


99 


2499 


gi14035876


Homo sapiens 


unnamed protein product 


1043 


99 


2499 


gi20070842 


Homo sapiens 


similar to hypothetical protein 
FLJ13448


1297 


99 


2501 


gi2072964


Homo sapiens


putative p150


399 


81 


2501 


gi2072967 


Homo sapiens 


putative p150


400 


81 


2501 


gi339777 


Homo sapiens 


ORF2 contains a reverse 
transcriptase domain. 


399 


81 


2502 


gi30040280 


Shigella flexneri 2a 
str. 2457T 


IS103 orf


731 


98 


2502 


gi30041139


Shigella flexneri 2a 
str. 2457T 


IS103 orf 


731 


98 


2502 


gi466695 


Escherichia coli 


orfA in IS150


731 


98 


2503 


gi12698037


Homo sapiens 


KIAA1746 protein 


341 


100 


2503 


gi26344121 


Mus musculus 


unnamed protein product 


318 


92 


2503 


gi26351415 


Mus musculus 


unnamed protein product 


318 


92 


2504 


gi20269073 


Homo sapiens 


putative lipid kinase 


1035 


99 


2504 


gi21624340


Homo sapiens


ceramide kinase


1035 


99 


2504 


gi21624342


Mus musculus 


ceramide kinases 


829 


81 


2505 


gi312584


Mus musculus 


biliary glycoprotein 


165 


27 




2505


gi312586


Mus musculus


biliary glycoprotein


165


27


2505 


gi312590


Mus musculus 


biliary glycoprotein 


174 


30 


2506 


gi312584 


Mus musculus 


biliary glycoprotein 


165 


27 


2506 


gi312586


Mus musculus 


biliary glycoprotein 


165 


27 


2506 


gi312590


Mus musculus 


biliary glycoprotein 


174 


30 


2507 


gi1480744


Equus caballus 


type II collagen 


346 


29 


2507 


gi30041 


Homo sapiens 


COL2A1 


344 


29 


2507 


gi450394 


Homo sapiens 


alpha-1 type II collagen


344 


29 


2508 


gi1483580


Rattus norvegicus 


NTR2 receptor 


911 


81 


2508 


gi18490912


Homo sapiens 


neurotensin receptor 2 


1072 


95 
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2508 


gi3901028 


Homo sapiens 


neurotensin receptor 2 


1074 


95 


2509 


gi1049104


Homo sapiens 


dystonin isoform 1 


221 


100 


2509 


gi14530942


Homo sapiens 


dystonin 2 


221 


100 


2509 


gi14530944


Homo sapiens 


dystonin 2 


221 


100 


2510 


gi1049104


Homo sapiens 


dystonin isoform 1 


221 


100 


2510 


gi14530942


Homo sapiens 


dystonin 2 


221 


100 


2510 


gi14530944


Homo sapiens 


dystonin 2 


221 


100 


2512 


gi1572721


Homo sapiens 


megakaryocyte stimulating 
factor; MSF 


203 


23 


2512 


gi16041156


Macaca fascicularis 


X-ray radiation resistance 
associated 1 protein 


710 


66 


2512 


gi18676652


Homo sapiens 


FLJ00225 protein 


761 


70 


2513 


gi1572721


Homo sapiens 


megakaryocyte stimulating 
factor; MSF 


203 


23 


2513 


gi16041156


Macaca fascicularis 


X-ray radiation resistance 
associated 1 protein 


710 


66 


2513 


gi18676652


Homo sapiens


FLJ00225 protein


761 


70 


2514 


gi26346328 


Mus musculus 


unnamed protein product 


965 


93 


2514 


gi33417011 


Mus musculus 




965 


93 


2514 


gi6330169 


Homo sapiens 


KIAA1164 protein 


1005 


99 


2515 


gi26346328 


Mus musculus 


unnamed protein product 


965 


93 


2515 


gi33417011 


Mus musculus 




965 


93 


2515 


gi6330169 


Homo sapiens 


KIAA1164 protein 


1005 


99 


2516 


gi12857668


Mus musculus 


unnamed protein product 


123 


43 


2516 


gi26327823 


Mus musculus 


unnamed protein product 


123 


43 


2517 


gi17429038


Ralstonia 
solanacearum 


PROBABLE ACYL-COA 
DEHYDROGENASE 
OXIDOREDUCTASE 
PROTEIN 


676 


61 


2517 


gi22776354 


Oceanobacillus 
iheyensis HTE831 


acyl-CoA dehydrogenase 


660 


63 


2517 


gi28280023 


Mus musculus 


5730439E10Rik protein 


974 


84 


2518 


gi17429038


Ralstonia 
solanacearum 


PROBABLE ACYL-COA 
DEHYDROGENASE 
OXIDOREDUCTASE 
PROTEIN 


676 


61 


2518 


gi22776354 


Oceanobacillus 
iheyensis HTE831 


acyl-CoA dehydrogenase 


660 


63 


2518 


gi28280023 


Mus musculus 


5730439E10Rik protein 


974 


84 


2519 


gi19070124


Mus musculus 


AF233346_1 zinc transporter- 
like 3 protein 


895 


95 


2519 


gi20563194 


Mus musculus 


AF395840_1 zinc transporter 6 


883 


93 


2519 


gi33338012 


Homo sapiens 


AF173387 1 MSTP103 


759 


94 


2520 


gi212451 


Gallus gallus


nonmuscle myosin heavy chain 


182 


20 


2520 


gi212452 


Gallus gallus 


nonmuscle myosin heavy chain 


182 


20 


2520 


gi4115748


Bos taurus 


nonmuscle myosin heavy chain 
B 


182 


19 


2521 


gi18605758


Mus musculus


9030409G11Rik protein


1257 


94 


2521 


gi6526769 


Homo sapiens 


HRIHFB2003 


1200 


96 


2521 


gi7291408 


Drosophila 
melanogaster 


CG11206-PA 


263 


26 


2524 


gi13182757


Homo sapiens


AF212238_1 HTPAP


843 


100 


2524 


gi21542541 


Homo sapiens 


Similar to HTPAP protein 


808 


100 


2524 


gi28381093 


Drosophila 


CG12746-PD 


410 


50 
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melanogaster 








2525 


gi13182757


Homo sapiens


AF212238_1 HTPAP


843


100


2525


gi21542541


Homo sapiens


Similar to HTPAP protein


808


100


2525


gi28381093


Drosophila
melanogaster


CG12746-PD


410




2527 


gi16416764


Homo sapiens


AF315594_1 FKSG16


1027 


100 


2527 


gi19353603


Mus musculus


D11Ertd18e protein


337


41


2527


gi31873637


Homo sapiens


hypothetical protein


1014


100


2528


gi16416764


Homo sapiens


AF315594_1 FKSG16


1027


100


2528


gi19353603


Mus musculus


D11Ertd18e protein


337 


41 


2528 


gi31873637


Homo sapiens 


hypothetical protein 


1014 


100 


2529 


gi32330803 


Mus musculus 


podocan protein 


1095 


90 


2529 


gi32330805 


Homo sapiens 


podocan protein 


1205 


97 


2529 


gi3786312 


Homo sapiens 


extracellular matrix protein 


281 


33 


2530 


gi20258604 


Homo sapiens 


sialic acid binding Ig-like 
lectin 5 


2913 


99 


2530 


gi2411475 


Homo sapiens 


OB binding protein-2 


2913 


99 


2530 


gi9454520 


Homo sapiens 


AC018755_5 SIGLEC5


2913 


99 


2531 


gi20258604 


Homo sapiens 


sialic acid binding Ig-like 
lectin 5 


2913 


99 


2531 


gi2411475 


Homo sapiens 


OB binding protein-2 


2913 


99 


2531 


gi9454520 


Homo sapiens 


AC018755_5 SIGLEC5


2913 


99 


2532 


gi13183078


Homo sapiens 


AF237652_1 a disintegrin-like
and metalloprotease domain 
with thrombospondin type I 
motifs-like 3 


602 


74 


2532 


gi15099921


Homo sapiens 


AF176313_1 ADAM-TS
related protein 1


874


98


2532 


gi20987759 


Homo sapiens 


Similar to ADAMTS-like 1 


886 


99


2533 


gi178836


Homo sapiens 


apolipoprotein C-II 


506 


100 


2533 


gi30582255 


Homo sapiens 


apolipoprotein C-II 


500 


99 


2533 


gi757915 


Homo sapiens 


apoCII protein


506 


100 


2534 


gi178836


Homo sapiens 


apolipoprotein C-II 


506 


100


2534 


gi30582255 


Homo sapiens 


apolipoprotein C-II 


500 


99 


2534 


gi757915 


Homo sapiens 


apoCII protein 


506 


100 


2536 


gi17389292


Homo sapiens 


LDL induced EC protein 


914 


98 


2536 


gi5924319 


Homo sapiens 


AF184939_1 LDL induced EC
protein 


914 


98 


2536 


gi8518179


Homo sapiens 


LDL induced endothelial cell 
protein 


941 


76 


2537 


gi28974490 


Homo sapiens 


lipoma HMGIC fusion-partner- 
like protein 


1071 


100 


2537 


gi30102428


Rattus norvegicus 


HMGIC fusion-partner-like 
protein 


1038 


95 


2537 


gi30411045


Mus musculus 


Similar to lipoma HMGIC 
fusion partner 


1037 


94 


2538 


gi14603353


Homo sapiens 


AAH10130 CGI-43 protein 


2362 


94 


2538 


gi23092946 


Drosophila 
melanogaster 


CG14980-PB 


537 


28 


2538 


gi4929555 


Homo sapiens 


AF151801_1 CGI-43 protein


2219 


89 


2539 


gi12654633


Homo sapiens 


Protein inhibitor of activated 
STAT3 


179 


84 


2539 


gi18606318


Mus musculus 


Protein inhibitor of activated 
STAT 3, isoform 1 


179 


84


2539


gi30582911 


Homo sapiens 


protein inhibitor of activated 
STAT3 


179 


84 


2540 


gi27449075 


Oreochromis 
mossambicus 


stearoyl-CoA desaturase 


743 


69 


2540 


gi29294686 


Homo sapiens 


SCD4 protein 


737 


100 


2540 


gi30350098 


Homo sapiens 


AF389338_1 acyl-CoA-
desaturase


1016 


100 


2541 


gi1000867


Homo sapiens 


DNA mismatch repair protein 


1931 


100 


2541 


gi1000869


Homo sapiens 


DNA mismatch repair protein 


1931 


100 


2541 


gi18204306


Homo sapiens 


AAH21566 


1931 


100 


2542 


gi11862941


Mus musculus 


DDM36E 


430 


48 


2542 


gi19570398


Homo sapiens 


hDDM36 


439 


49 


2542 


gi7650186 


Mus musculus 


AF176694_1 neighbor of Punc
e11 protein


430 


48 


2543 


gi21744725 


Homo sapiens 


AF478693_1 glycosyl-
phosphatidyl-inositol-MAM


717 


97 


2543 


gi25005318 


Sus scrofa 


MAM domain containing 
glycosylphosphatidylinositol 
anchor 1 


672 


91 


2543 


gi25005320 


Sus scrofa 


glycosylphosphatidylinositol 
anchor 1 protein 


672 


91 


2544 


gi12276198


Homo sapiens 


AF333487_1 FKSG40 


543 


96 


2544 


gi12408250


Homo sapiens 


FKSG28 


543 


96 


2544 


gi18652934


Xenopus laevis 


Mig30 


514 


48 


2545 


gi16769552


Drosophila 
melanogaster 


LD38375p 


367 


51 


2545 


gi27696627 


Homo sapiens 


Ribosome biogenesis protein 
BMS1 homolog 


684 


93 


2545 


gi7294027 


Drosophila 
melanogaster 


CG7728-PA 


367 


51 


2546 


gi12842044


Mus musculus 


unnamed protein product 


375 


72 


2546 


gi18921437


Mus musculus 


2010004A03Rik protein 


375 


72 


2546 


gi20987450 


Homo sapiens 


LOC146433 


468 


91 


2547 


gi1016012


Rattus norvegicus 


neural cell adhesion protein 
BIG-2 precursor 


543 


93 


2547 


gi26891535 


Homo sapiens 


contactin 4 


570 


100 


2547 


gi29837411 


Homo sapiens 


BIG-2 


570 


100 


2548 


gi30102449 


Homo sapiens 


lipoma HMGIC fusion-partner- 
like protein 


822 


100 


2548 


gi30908798 


Homo sapiens 


lipoma HMGIC fusion partner- 
like protein 4 


676 


78 


2548 


gi30908800 


Rattus norvegicus 


lipoma HMGIC fusion partner- 
like protein 4 


675 


78 


2549 


gi13097705


Homo sapiens 


AAH03559 , member 3 


237 


52 


2549 


gi1340142


Homo sapiens 


alpha 1-antichymotrypsin


237 


52 


2549 


gi4165890


Homo sapiens 


alpha-1-antichymotrypsin
precursor 


237 


52 


2550 


gi1850850


Murid herpesvirus 4 


serine threonine rich 
glycoprotein 


207 


33 


2550 


gi21618556 


Homo sapiens 




4040 


97 


2550 


gi33304372 


Homo sapiens 


tastin 


4035 


97 


2551 


gi12053849


Homo sapiens 


DREV protein 


1649 


98 


2551 


gi12053851


Homo sapiens 


DREV1 protein 


1633 


98 


2551 


gi12053853


Homo sapiens 


DREV protein 


1649 


98


2553


gi11990779


Homo sapiens 




273 


50 


2553 


gi22760096 


Homo sapiens 


unnamed protein product 


538 


100 


2553 


gi28279813 


Homo sapiens 


Similar to hypothetical protein 
DKFZp434A171 


515 


97 


2554 


gi11125348


Homo sapiens 


putative protein kinase 


2419 


99 


2554 


gi6933864 


Homo sapiens 


kinase deficient protein KDP 


2419 


99 


2554 


gi8272557 


Rattus norvegicus 


AF227741_1 protein kinase
WNK1 


2340 


96 


2555 


gi11125348


Homo sapiens 


putative protein kinase 


2419 


99 


2555 


gi6933864 


Homo sapiens 


kinase deficient protein KDP 


2419 


99 


2555 


gi8272557 


Rattus norvegicus 


AF227741_1 protein kinase
WNK1 


2340 


96 


2556 


gi3599339 


Mus musculus 
domesticus 


ORF2 


138 


60 


2556 


gi3599342 


Mus musculus 
domesticus 


ORF2 


138 


60 


2556 


gi3599347 


Mus musculus 
domesticus 


ORF2 


138 


60 


2557 


gi15020809


Takifugu rubripes 


putative methionyl tRNA 
synthetase 


674 


74 


2557 


gi17861592


Drosophila 
melanogaster 


GH13807p 


567 


61 


2557 


gi23171238


Drosophila 
melanogaster 


CG31322-PA 


567 


61 


2558 


gi15341975


Homo sapiens 


AAH13184 Similar to major 
histocompatibility complex, 
class II, DP beta 1 


432 


72 


2558 


gi17389919


Homo sapiens 


AAH17967 Similar to major 
histocompatibility complex, 
class II, DP beta 1 


814 


100 


2558 


gi188479


Homo sapiens 


HLA-DPB1


432 


72 


2559 


gi15779083


Homo sapiens 


AAH14609 


1122 


90 


2559 


gi3342737 


Homo sapiens 


R26660_2, partial CDS 


967 


86 


2559 


gi3478640 


Homo sapiens 


R26660_2, partial CDS


138 


89 


2560 


gi15779083


Homo sapiens 


AAH14609 


1122 


90 


2560 


gi3342737 


Homo sapiens 


R26660_2, partial CDS 


967 


86 


2560 


gi3478640 


Homo sapiens 


R26660_2, partial CDS 


138 


89 


2561 


gi13991167


Homo sapiens 


sialic acid-binding 
immunoglobulin-like lectin-like 
long splice variant 


661 


99 


2561 


gi14625822


Homo sapiens 


AF282256_1 Siglec-L1


661 


99 


2561 


gi23272769 


Homo sapiens 


SIGLEC-like I 


661 


99 


2562 


gi15132186


Homo sapiens 


unnamed protein product 


1122 


88 


2562 


gi15132529


Homo sapiens 


unnamed protein product 


1122 


88 


2562 


gi21439502 


Homo sapiens 


unnamed protein product 


1122 


88 


2563 


gi202592 


Rattus norvegicus 


prealpha-2-macroglobulin 


238 


40 


2563 


gi671864 


Gallus gallus 


ovomacroglobulin, ovostatin 


230 


40 


2563 


gi671865 


Gallus gallus 


ovomacroglobulin, ovostatin


230 


40 


2564 


gi25990364 


Homo sapiens 


AF319622_1 P-glycoprotein


191 


100


685


BL00266 


Somatotropin, prolactin and related hormones 
proteins. 


BL00266A 15:6* « a47e,ll 35-61 


686 


PR00836 


SOMATOTROPIN HORMONE FAMILY 
SIGNATURE 


PR00836A 14.40 2.862e-11 79-92
PR00836B 16.59 7.000e-11 101-119


686 


BL00266 


Somatotropin, prolactin and related hormones 
proteins. 


BL00266B 24.48 8.714e-21 79-116 
BL00266A 15.69 1.923e-14 35-61 
BL00266D 12.72 4.000e-11 201-224
BL00266C 13.66 3.700e-10 135-151 


688 


PR00836 


SOMATOTROPIN HORMONE FAMILY 
SIGNATURE 


PR00836B 16.59 2.895e-16 101-119 
PR00836A 14.40 2.800e-13 79-92 


688


BL00266 


Somatotropin, prolactin and related hormones 
proteins. 


BL00266B 24.48 4.000e-29 79-116
BL00266A 15.69 9.000e-19 35-61
BL00266D 12.72 4.000e-11 201-224
BL00266C 13.66 4.000e-10 135-151






Serpins proteins.


dL,UUZo4U Zo.jo j./0Ue-2o 185-226 

"RT nnoc/TE in k i ni** in ini ic\n 
DijjKjAo^xii iy.iD l.jjje-i / o /j-jy / 

BL00284A 15.64 8.714e-16 77-100 

P.T n09Sztr> 1£ 7 J. 7 770o 10 7Q/1 17A 

oxAJuzotx/ io.o*f /.z/ye-iz zy4ozu 
BL00284B 17.99 4.825e-10 158-178


690 


PR00390 


PHOSPHOLIPASE C SIGNATURE


PT?nn^onA i^ no 1 AiQt> ic\ 101 ono 
rx\.uuji/i//\ ij.uy i.*4oye-zu iyi-zuy 


690 


BL00303 


S-100/ICaBP type calcium binding protein. 


BL00303B 26.15 4.971e-09 31-67 


690 


BL00292 


Cyclins proteins. 


BL00292A 22.87 5.114e-09 116-149 


691 


PF00756 


Putative esterase. 


PF00756C 14.12 1.108e-09 438-467 




BL00120


Lipases, serine proteins.


xoJLUUIZUiJ 11.5 1 4.4oze-0y 435-449 


693 


PR00573 


INTERLEUKIN 8B RECEPTOR
SIGNATURE


PR00573C 9.99 7.300e-10 38-46 


693 


PR00427 


INTERLEUKIN-8 RECEPTOR


PR00427A 16.30 9.700e-10 34-48 


694 


BL01238


GDA1/CD39 family of nucleoside
phosphatases proteins.


"DT A1 n DA ii oo o inr\« i c 1 a>i 110 
oJLUlZJoA 11. 11 o.ZU0e-lo 104-1 lo 

txj ni oist^ 1 n 1 o a 1 1da 1 ^ iaq o/:i 
oLUlZjoJJ iu. iy h. ljUe-lJ ZHo-ZOl 

BL01238C 14.36 6.677e-12 219-240 
BL01238B 10.99 2.071e-10 176-186 


695


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE


PR00237F 13.57 5.636e-10 239-263 


695 


BL00237 


G-protein coupled receptors proteins. 


BL00237C 13.19 5.034e-12 234-260 
rJJLUUZj /A Z/.Oo o.ouue-iu /Z-lll 


695 


PR 001 72 


Vji_#UV«/VyOXi XlVrVlNor wJtvlxirv OlVJlNAX UlVD 


Di?nni nic^ o < 1 1 /ci no coo 
rrtuui /zv^ y,Di z.oize-uy 0-Z0 


696 


BL00615


C-type lectin domain proteins.


dl»uuoija 10.00 z.uoue-11 i/j-iyz 


698 


BL01238 


GDA1/CD39 family of nucleoside 
phosphatases proteins.


BL01238A 11.72 4.240e-16 51-65 

T>T m O^lOT^ in mo n(\i~ 1 a 1 nzr inn 

JtJJLUlzioD 10.19 z.703e-14 196-209 

RT fll 0^80 1/1 o /CAOo 10 1#CO 1 CO 

jdxvUizjoo i*f.Jo z.ooze-iz 10/-I00 

RT 0197RR 10 QQ ^ ^^9*> 17 1 1A MA 

DLuizjoD iu.yy o.jjoe-iz 1Z4-U4 


700 


BL00037 


Myb DNA-binding domain proteins repeat 
proteins proteins. 


BL00037A 16.68 3.571e-11 231-254


700 


PF00569 


Zinc finger present in dystrophin, CBP/p300. 


PF00569 13.42 4.214e-10 184-200 


700 


PR00608 


CLASS II CYTOCHROME C SIGNATURE


PR00608A 13.74 6.434e-10 118-141 


700 


PR00456 


RIBOSOMAL PROTEIN P2 SIGNATURE


PR00456E 3.06 8.861e-09 123-137 
PR00456E 3.06 9.772e-09 122-136 


701 


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE 


PR00049D 0.00 1.000e-09 280-294 


703 


PF00650 


CRAL/TRIO domain proteins. 


PF00650D 24.34 1.776e-12 177-210 


703 


PR00180 


CELLULAR RETINALDEHYDE-BINDING
PROTEIN SIGNATURE


PR00180A 10.11 7.231e-11 37-59
PR00180D 12.78 9.769e-10 202-221


705 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.286e-09 756-768


705 


BL00291 


Prion protein. 


BL00291A 4.49 8.552e-09 196-230


706 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400D 23.26 7.222e-12 251-287 


708 


BL00478 


LIM domain proteins. 


BL00478B 14.79 3.000e-12 31-45


710 


BL00604 


Synaptophysin / synaptoporin proteins.


BL00604F 5.96 7.718e-10 1379-1423


710 


PR00524 


CHOLECYSTOKININ TYPE A RECEPTOR
SIGNATURE 


PR00524F 5.36 7.415e-09 1220-1233 


710 


BL00242 


Integrins alpha chain proteins. 


BL00242B 8.13 8.615e-09 469-478 


710 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 3.571e-13 1043-1071 
BL00420A 20.42 9.082e-13 1125-1153
BL00420A 20.42 2.038e-12 142-170 
BL00420A 20.42 4.462e-12 714-742 
BL00420A 20.42 8.962e-12 454-482 
BL00420A 20.42 9.135e-12 935-963 
BL00420A 20.42 9.827e-12 797-825 
BL00420A 20.42 1.327e-11 202-230
BL00420A 20.42 3.291e-11 803-831
BL00420A 20.42 3.618e-11 521-549
BL00420A 20.42 4.927e-11 589-617
BL00420A 20.42 6.400e-11 64-92
BL00420A 20.42 8.036e-11 451-479
BL00420A 20.42 8.691e-11 1323-1351
BL00420A 20.42 9.345e-11 199-227
BL00420A 20.42 2.623e-10 944-972 
BL00420A 20.42 2.770e-10 100-128 
BL00420A 20.42 2.770e-10 842-870 
BL00420A 20.42 2.918e-10 741-769 
BL00420A 20.42 4.098e-10 1137-1165
BL00420A 20.42 4.393e-10 696-724
BL00420A 20.42 4.541e-10 1170-1198
BL00420A 20.42 5.279e-10 1046-1074 
BL00420A 20.42 5.426e-10 296-324 
BL00420A 20.42 5.426e-10 1149-1177 
BL00420A 20.42 6.754e-10 747-775 
BL00420A 20.42 6.754e-10 1061-1089 
BL00420A 20.42 6.902e-10 1278-1306 
BL00420A 20.42 7.049e-10 624-652 
BL00420A 20.42 7.492e-10 1055-1083 
BL00420A 20.42 8.082e-10 1037-1065 
BL00420A 20.42 8.525e-10 836-864 
BL00420A 20.42 8.672e-10 187-215 
BL00420A 20.42 8.672e-10 598-626 
BL00420A 20.42 8.820e-10 139-167 
BL00420A 20.42 8.820e-10 896-924 
BL00420A 20.42 8.967e-10 717-745 
BL00420A 20.42 9.115e-10 314-342 
BL00420A 20.42 9.705e-10 923-951 
BL00420A 20.42 9.852e-10 369-397 
BL00420A 20.42 9.852e-10 806-834 
BL00420A 20.42 9.852e-10 1179-1207
BL00420A 20.42 1.138e-09 863-891
BL00420A 20.42 1.415e-09 509-537
BL00420A 20.42 1.415e-09 530-558
BL00420A 20.42 2.523e-09 857-885 
BL00420A 20.42 2.800e-09 1182-1210 
BL00420A 20.42 2.938e-09 1426-1454 
BL00420A 20.42 3.077e-09 630-658 
BL00420A 20.42 3.354e-09 103-131 
BL00420A 20.42 3.492e-09 782-810 
BL00420A 20.42 3.492e-09 1064-1092
BL00420A 20.42 3.631e-09 860-888 
BL00420A 20.42 3.769e-09 920-948 
BL00420A 20.42 4.185e-09 869-897 
BL00420A 20.42 4.600e-09 518-546 
BL00420A 20.42 5.015e-09 1317-1345 
BL00420A 20.42 5.292e-09 524-552 
BL00420A 20.42 5.431e-09 633-661 
BL00420A 20.42 5.569e-09 729-757 
BL00420A 20.42 5.569e-09 824-852 
BL00420A 20.42 5.569e-09 1049-1077 
BL00420A 20.42 6.123e-09 366-394 
BL00420A 20.42 6.262e-09 491-519 
BL00420A 20.42 6.538e-09 914-942 
BL00420A 20.42 6.954e-09 566-594 
BL00420A 20.42 6.954e-09 711-739 
BL00420A 20.42 6.954e-09 893-921 
BL00420A 20.42 7.369e-09 818-846 
BL00420A 20.42 7.923e-09 1471-1499 
BL00420A 20.42 8.062e-09 735-763 
BL00420A 20.42 8.477e-09 1347-1375
BL00420A 20.42 8.754e-09 1095-1123 
BL00420A 20.42 9.031e-09 61-89 
BL00420A 20.42 9.308e-09 311-339
BL00420A 20.42 9.308e-09 938-966 
BL00420A 20.42 9.446e-09 1299-1327 
BL00420A 20.42 9.585e-09 363-391 
BL00420A 20.42 9.723e-09 794-822 
BL00420A 20.42 9.862e-09 1302-1330 


710 


BL01113 


C1q domain proteins.


BL01113A 17.99 1.290e-15 423-449 
BL01113A 17.99 6.455e-14 1170-1196 
BL01113A 17.99 8.909e-14 509-535 
BL01113A 17.99 8.909e-14 812-838 
BL01113A 17.99 8.909e-14 815-841 
BL01113A 17.99 3.676e-13 854-880
BL01113A 17.99 5.622e-13 1040-1066 
BL01113A 17.99 8.054e-13 788-814 
BL01113A 17.99 9.514e-13 589-615 
BL01113A 17.99 9.757e-13 363-389 
BL01113A 17.99 1.923e-12 1405-1431 
BL01113A 17.99 2.154e-12 845-871
BL01113A 17.99 2.615e-12 932-958 
BL01113A 17.99 3.077e-12 953-979
BL01113A 17.99 3.308e-12 524-550 
BL01113A 17.99 3.769e-12 566-592 
BL01113A 17.99 3.769e-12 797-823 
BL01113A 17.99 4.231e-12 624-650 
BL01113A 17.99 4.462e-12 1242-1268 
BL01113A 17.99 5.154e-12 639-665 
BL01113A 17.99 5.846e-12 779-805 
BL01113A 17.99 6.308e-12 598-624
BL01113A 17.99 6.538e-12 923-949 
BL01113A 17.99 6.538e-12 1046-1072 
BL01113A 17.99 7.462e-12 112-138 
BL01113A 17.99 7.692e-12 705-731 
BL01113A 17.99 8.615e-12 211-237 
BL01113A 17.99 8.846e-12 196-222 
BL01113A 17.99 9.769e-12 460-486 
BL01113A 17.99 1.000e-11 1296-1322
BL01113A 17.99 1.205e-11 1043-1069
BL01113A 17.99 1.409e-11 821-847
BL01113A 17.99 1.614e-11 1182-1208
BL01113A 17.99 1.818e-11 747-773
BL01113A 17.99 3.659e-11 451-477
BL01113A 17.99 4.273e-11 914-940
BL01113A 17.99 4.477e-11 836-862
BL01113A 17.99 4.886e-11 729-755
BL01113A 17.99 5.091e-11 744-770
BL01113A 17.99 5.091e-11 1179-1205
BL01113A 17.99 5.500e-11 633-659
BL01113A 17.99 5.500e-11 714-740
BL01113A 17.99 6.523e-11 1468-1494
BL01113A 17.99 6.727e-11 205-231
BL01113A 17.99 6.727e-11 824-850
BL01113A 17.99 7.341e-11 1423-1449
BL01113A 17.99 8.364e-11 595-621
BL01113A 17.99 9.386e-11 687-713
BL01113A 17.99 9.795e-11 690-716
BL01113A 17.99 1.000e-10 806-832 
BL01113A 17.99 1.383e-10 494-520 
BL01113A 17.99 1.383e-10 803-829 
BL01113A 17.99 1.766e-10 560-586 
BL01113A 17.99 1.766e-10 1414-1440
BL01113A 17.99 2.149e-10 938-964 
BL01113A 17.99 2.340e-10 208-234 
BL01113A 17.99 2.723e-10 64-90

UTAH HA 17QOO OHf» 1A^70 ^Qfi 

BL01113A 17.99 2.915e-10 592-618 
BL01113A 17.99 2.915e-10 1368-1394
BL01113A 17.99 3.298e-10 750-776
BL01113A 17.99 3.872e-10 518-544
BL01113A 17.99 5.404e-10 842-868
BL01113A 17.99 5.596e-10 857-883
BL01113A 17.99 6.170e-10 794-820 
BL01113A 17.99 6.745e-10 148-174 



WO 2004/080148 



PCT/US2003/030720



295 

TABLE 3A 



SEQ 
ID 


Database
entry ID


Description 


Result* 








BL01113A 17.99 6.745e-10 202-228 
BL01113A 17.99 6.745e-10 1251-1277 
BL01113A 17.99 7.319e-10 929-955 
BL01113A 17.99 7.319e-10 1305-1331
BL01113A 17.99 7.511e-10 432-458
BL01113A 17.99 7.702e-10 563-589
BL01113A 17.99 7.702e-10 896-922 
BL01113A 17.99 8.085e-10 1176-1202 
BL01113A 17.99 8.277e-10 296-322 
BL01113A 17.99 8.660e-10 1317-1343 
BL01113A 17.99 9.234e-10 121-147 
BL01113A 17.99 9.426e-10 863-889 
BL01113A 17.99 1.346e-09 426-452 
BL01113A 17.99 1.519e-09 454-480 
BL01113A 17.99 1.692e-09 500-526 
BL01113A 17.99 1.692e-09 911-937 
BL01113A 17.99 1.865e-09 782-808 
BL01113A 17.99 2.038e-09 1284-1310 
BL01113A 17.99 2.212e-09 94-120
BL01113A 17.99 2.212e-09 1365-1391 
BL01113A 17.99 2.385e-09 604-630
BL01113A 17.99 2.385e-09 893-919 
BL01113A 17.99 2.385e-09 1098-1124 
BL01113A 17.99 2.731e-09 1161-1187
BL01113A 17.99 2.904e-09 1465-1491
BL01113A 17.99 3.077e-09 506-532 
BL01113A 17.99 3.423e-09 1143-1169 
BL01113A 17.99 3.423e-09 1320-1346 
BL01113A 17.99 3.769e-09 1408-1434 
BL01113A 17.99 3.769e-09 1462-1488 
BL01113A 17.99 3.942e-09 366-392 
BL01113A 17.99 3.942e-09 902-928
BL01113A 17.99 3.942e-09 1037-1063
BL01113A 17.99 3.942e-09 1185-1211
BL01113A 17.99 4.115e-09 1290-1316
BL01113A 17.99 4.462e-09 557-583 
BL01113A 17.99 4.462e-09 575-601 
BL01113A 17.99 4.981e-09 1055-1081 
BL01113A 17.99 5.154e-09 533-559 
BL01113A 17.99 5.327e-09 678-704 
BL01113A 17.99 5.327e-09 1031-1057 
BL01113A 17.99 5.500e-09 187-213 
BL01113A 17.99 5.500e-09 497-523 
BL01113A 17.99 5.500e-09 1332-1358 
BL01113A 17.99 5.673e-09 329-355 
BL01113A 17.99 5.673e-09 899-925 
BL01113A 17.99 6.192e-09 1006-1032 
BL01113A 17.99 6.192e-09 1155-1181 
BL01113A 17.99 6.365e-09 681-707
BL01113A 17.99 6.538e-09 723-749 
BL01113A 17.99 6.538e-09 833-859 
BL01113A 17.99 6.712e-09 199-225
BL01113A 17.99 6.712e-09 720-746
BL01113A 17.99 6.885e-09 839-865
BL01113A 17.99 7.058e-09 145-171
BL01113A 17.99 7.058e-09 190-216
BL01113A 17.99 7.231e-09 1236-1262
BL01113A 17.99 7.404e-09 830-856
BL01113A 17.99 7.750e-09 684-710
T^T fil 1 11 A 17 QQ *7 Olio HO QfK Q11
RT /I1 1 1 1 A 1 7 QQ 8 HQrV HQ f.QfLlOO
BL01113A 17.99 8.269e-09 630-656
RT 01 1 11 A 17 QQ R ?6Qp-0Q 1257-1981
BL01113A 17.99 9.308e-09 299-325
BL01113A 17.99 9.308e-09 944-970
BL01113A 17.99 9.654e-09 457-483
BL01113A 17.99 1.000e-08 67-93
BL01113A 17.99 1.000e-08 908-934


711 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 4.545e-10 211-221 


711 


PD02283 


PROTEIN SPORULATION REPEAT 
PRECU. 


PD02283C 17.54 9.408e-10 3649-3676 


711 


PR00873 


ECHINOIDEA (SEA URCHIN) 
METALLOTHIONEIN SIGNATURE 


PR00873D 8.43 5.500e-09 4326-4344 


711 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907B 11.29 4.974e-10 4218-4234
PR00907B 11.29 5.720e-09 162-178 


711 


BL00425 


Arthropod defensins proteins. 


BL00425 10.48 5.781e-09 1216-1234 


711 


PR00261 


LOW DENSITY LIPOPROTEIN (LDL) 
RECEPTOR SIGNATURE 


PR00261C 11.37 4.000e-20 1015-1036 
PR00261D 12.47 5.125e-20 892-913 
PR00261B 14.12 5.588e-20 3600-3621 
PR00261B 14.12 9.294e-20 1101-1122 
PR00261B 14.12 2.667e-19 1053-1074 
PR00261C 11.37 3.250e-19 2852-2873
PR00261A 11.02 7.058e-19 1101-1122 
PR00261A 11.02 8.615e-19 1015-1036 
PR00261B 14.12 9.500e-19 933-954 
PR00261D 12.47 1.500e-18 3721-3742 
PR00261B 14.12 2.263e-18 3523-3544 
PR00261B 14.12 2.421e-18 2729-2750 
PR00261A 11.02 2.833e-18 1144-1165 
PR00261D 12.47 3.000e-18 1015-1036 
PR00261D 12.47 3.167e-18 1053-1074 
PR00261C 11.37 3.618e-18 1053-1074 
PR00261A 11.02 5.000e-18 3600-3621 
PR00261C 11.37 5.582e-18 2809-2830 
PR00261A 11.02 6.000e-18 1053-1074 
PR00261C 11.37 6.236e-18 1101-1122
PR00261C 11.37 6.891e-18 3562-3583
PR00261A 11.02 7.000e-18 892-913 
PR00261D 12.47 8.167e-18 1144-1165 
PR00261D 12.47 8.333e-18 1101-1122 
PR00261C 11.37 8.527e-18 3484-3505 
PR00261C 11.37 9.018e-18 2767-2788 
PR00261C 11.37 1.310e-17 1144-1165 
PR00261D 12.47 2.579e-17 3600-3621 



PR00261B 14.12 2.650e-17 3680-3701
PR00261D 12.47 2.737e-17 3680-3701 
PR00261C 11.37 3.017e-17 892-913 
PR00261B 14.12 3.250e-17 892-913 
PR00261A 11.02 4.158e-17 3562-3583
PR00261F 11.57 5.673e-17 2938-2959 
PR00261A 11.02 6.368e-17 2809-2830 
PR00261A 11.02 6.684e-17 3680-3701 
PR00261A 11.02 6.842e-17 3364-3385 
PR00261C 11.37 8.138e-17 3680-3701 
PR00261A 11.02 8.895e-17 2729-2750 
PR00261C 11.37 9.845e-17 974-995
PR00261D 12.47 1.153e-16 2767-2788 
PR00261D 12.47 1.153e-16 3364-3385 
PR00261F 11.57 1.321e-16 1015-1036 
PR00261D 12.47 1.610e-16 2687-2708 
PR00261D 12.47 1.915e-16 974-995 
PR00261F 11.57 1.964e-16 2599-2620
PR00261D 12.47 2.831e-16 2852-2873 
PR00261B 14.12 2.887e-16 3364-3385 
PR00261B 14.12 3.032e-16 2809-2830 
PR00261A 11.02 3.136e-16 80-101 
PR00261D 12.47 3.441e-16 2809-2830 
PR00261D 12.47 3.441e-16 3484-3505
PR00261C 11.37 3.951e-16 2938-2959 
PR00261C 11.37 4.246e-16 80-101 
PR00261D 12.47 4.356e-16 3523-3544 
PR00261E 11.08 5.000e-16 892-913 
PR00261C 11.37 5.279e-16 2729-2750 
PR00261D 12.47 7.407e-16 80-101 
PR00261E 11.08 7.500e-16 3680-3701 
PR00261B 14.12 7.532e-16 2767-2788 
PR00261A 11.02 7.712e-16 3484-3505 
PR00261F 11.57 8.071e-16 1053-1074
PR00261B 14.12 8.403e-16 1015-1036 
PR00261C 11.37 8.525e-16 3364-3385
PR00261F 11.57 8.714e-16 3809-3830 
PR00261A 11.02 8.932e-16 2767-2788 
PR00261F 11.57 9.357e-16 3523-3544 
PR00261D 12.47 1.429e-15 2599-2620 
PR00261B 14.12 1.554e-15 1144-1165 
PR00261A 11.02 1.726e-15 2852-2873 
PR00261D 12.47 1.857e-15 933-954 
PR 0096 IP 1 1 XI 7 O00e-1 S r W*-T544 
PR00261B 14.12 2.108e-15 2599-2620 
PR00261B 14.12 2.246e-15 974-995 
PR00261F 11.57 2.397e-15 3444-3465 
PR00261D 12.47 2.714e-15 3404-3425 
PR00261E 11.08 3.211e-15 974-995
PR00261A 11.02 3.323e-15 2687-2708 
PR00261E 11.08 3.526e-15 1053-1074 
PR00261D 12.47 4.429e-15 3562-3583
PR00261E 11.08 4.632e-15 1015-1036
PR00261D 12.47 5.000e-15 2938-2959
PR00261C 11.37 5.286e-15 3404-3425
PR00261E 11.08 5.579e-15 2599-2620
PR00261A 11.02 5.645e-15 3523-3544
PR00261F 11.57 5.966e-15 2638-2659
PR00261B 14.12 6.262e-15 2938-2959
PR00261F 11.57 6.276e-15 2852-2873 
PR00261C 11.37 6.286e-15 2638-2659 
PR00261E 11.08 6.684e-15 1101-1122 
PR00261C 11.37 7.286e-15 3809-3830 
PR00261B 14.12 8.062e-15 3444-3465 
PR00261E 11.08 8.421e-15 1144-1165 
PR00261F 11.57 9.690e-15 2767-2788 
PR00261B 14.12 1.000e-14 80-101 
PR00261F 11.57 1.145e-14 974-995
PR00261F 11.57 1.581e-14 3364-3385 
PR00261A 11.02 2.246e-14 933-954 
PR00261C 11.37 2.478e-14 3641-3662 
PR00261B 14.12 2.853e-14 3721-3742 
PR00261A 11.02 3.631e-14 2938-2959
PR00261D 12.47 3.813e-14 2729-2750 
PR00261D 12.47 3.813e-14 3809-3830 
PR00261E 11.08 3.850e-14 2767-2788 
PR00261E 11.08 4.300e-14 2729-2750 
PR00261C 11.37 4.358e-14 3444-3465 
PR00261E 11.08 4.450e-14 2938-2959 
PR00261D 12.47 4.797e-14 2558-2579 
PR00261E 11.08 4.900e-14 3809-3830 
PR00261F 11.57 4.919e-14 1101-1122 
PR00261F 11.57 5.355e-14 3641-3662 
PR00261C 11.37 6.104e-14 2599-2620 
PR00261E 11.08 6.400e-14 3641-3662 
PR00261A 11.02 7.092e-14 3809-3830
PR00261B 14.12 7.221e-14 3809-3830 
PR00261B 14.12 7.353e-14 3641-3662 
PR00261F 11.57 7.823e-14 1144-1165 
PR00261B 14.12 7.882e-14 2687-2708
PR00261E 11.08 8.350e-14 3721-3742 
PR00261E 11.08 8.650e-14 2809-2830 
PR00261D 12.47 9.016e-14 3641-3662 
PR00261C 11.37 9.328e-14 3721-3742 
PR00261D 12.47 9.719e-14 2638-2659 
PR00261C 11.37 1.522e-13 3600-3621 
PR00261F 11.57 2.688e-13 2729-2750 
PR00261E 11.08 2.828e-13 3404-3425 
PR00261A 11.02 2.853e-13 2558-2579 
PR00261B 14.12 2.901e-13 2852-2873 
PR00261E 11.08 2.969e-13 2852-2873 
PR00261E 11.08 2.969e-13 3764-3785 
PR00261A 11.02 3.515e-13 974-995 
PR00261C 11.37 3.609e-13 2687-2708
PR00261E 11.08 3.813e-13 3364-3385
PR00261A 11.02 3.912e-13 3721-3742 
PR00261E 11.08 4.094e-13 1185-1206 
PR00261E 11.08 4.094e-13 2638-2659 
PR00261A 11.02 6.162e-13 3404-3425 
PR00261A 11.02 6.956e-13 2893-2914 
PR00261E 11.08 7.328e-13 3523-3544 
PR00261A 11.02 7.485e-13 2599-2620
PR00261F 11.57 7.891e-13 2558-2579 
PR00261B 14.12 7.972e-13 2638-2659 
PR00261E 11.08 9.016e-13 3562-3583
PR00261E 11.08 9.297e-13 2558-2579
PR00261F 11.57 9.578e-13 3404-3425 
PR00261F 11.57 9.578e-13 3680-3701 
PR00261D 12.47 1.254e-12 3444-3465
PR00261F 11.57 1.265e-12 2809-2830 
PR00261C 11.37 1.370e-12 933-954 
PR00261E 11.08 1.545e-12 2687-2708 
PR00261F 11.57 1.926e-12 3562-3583 
PR00261F 11.57 2.456e-12 3721-3742 
PR00261B 14.12 2.603e-12 3562-3583 
PR00261F 11.57 3.382e-12 1185-1206 
PR00261B 14.12 4.205e-12 3404-3425 
PR00261E 11.08 4.955e-12 2893-2914 
PR00261A 11.02 5.310e-12 3641-3662 
PR00261C 11.37 6.178e-12 125-146 
PR00261C 11.37 6.301e-12 1185-1206 
PR00261F 11.57 8.147e-12 3484-3505 
PR00261E 11.08 8.364e-12 80-101 
PR00261E 11.08 8.500e-12 125-146 
PR00261B 14.12 8.644e-12 3484-3505 
PR00261F 11.57 8.676e-12 892-913 
PR00261D 12.47 9.493e-12 2893-2914 
PR00261A 11.02 1.365e-11 3444-3465
PR00261F 11.57 1.625e-11 3764-3785
PR00261E 11.08 1.643e-11 3484-3505
PR00261E 11.08 1.771e-11 3600-3621
PR00261A 11.02 2.581e-11 2638-2659
PR00261A 11.02 2.824e-11 1185-1206
PR00261F 11.57 3.500e-11 933-954
PR00261C 11.37 5.263e-11 2558-2579
PR00261F 11.57 5.375e-11 2687-2708
PR00261D 12.47 7.081e-11 125-146
PR00261A 11.02 7.811e-11 125-146
PR00261F 11.57 8.500e-11 3600-3621
PR00261E 11.08 9.871e-11 3444-3465
PR00261F 11.57 2.320e-10 80-101
PR00261F 11.57 2.920e-10 125-146 
PR00261C 11.37 3.813e-10 2893-2914 
PR00261B 14.12 5.111e-10 2558-2579 
PR00261D 12.47 6.377e-10 3764-3785 
PR00261D 12.47 6.610e-10 1185-1206
PR00261B 14.12 7.667e-10 125-146
PR00261B 14.12 8.889e-10 1185-1206
PR00261A 11.02 8.962e-10 3764-3785
PR00261E 11.08 9.137e-10 933-954
PR00261B 14.12 1.321e-09 2893-2914
PR00261C 11.37 7.429e-09 3764-3785 


711 


BL01177 


Anaphylatoxin domain proteins. 


BL01177C 17.39 7.429e-09 2973-2991 
BL01177C 17.39 8.286e-09 200-218 


711 


BL00799 


Granulins proteins. 


BL00799E 14.64 8.627e-09 1201-1249 


711 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764B 13.56 3.593e-15 1048-1068 
PR00764B 13.56 2.227e-13 3636-3656 
PR00764B 13.56 8.091e-13 1139-1159 
PR00764B 13.56 5.565e-12 928-948 
PR00764B 13.56 7.652e-12 1010-1030 
PR00764B 13.56 8.043e-12 3399-3419 
PR00764B 13.56 2.250e-11 3595-3615
PR00764B 13.56 4.000e-11 3557-3577
PR00764B 13.56 4.500e-11 2762-2782
PR00764B 13.56 6.000e-11 969-989
PR00764B 13.56 7.125e-11 2633-2653
PR00764B 13.56 8.875e-11 2724-2744
PR00764B 13.56 9.625e-11 887-907

JrKUU/04r> Ij.jO O.j / /e-lU ZoU4-ZoZ*t 

PR00764B 13.56 1.338e-09 3479-3499 
PR00764B 13.56 1.563e-09 120-140 
PR00764B 13.56 3.025e-09 3439-3459 
PR00764B 13.56 3.925e-09 75-95 
PR00764B 13.56 5.388e-09 2594-2614 
PR00764B 13.56 6.963e-09 2553-2573 

PR00764B 13.56 8.763e-09 3518-3538 


711 


BL01187 


Calcium-binding EGF-like domain proteins 
pattern proteins. 


BL01187B 12.04 8.412e-15 206-221 
BL01187B 12.04 2.333e-12 3019-3034 
BL01187B 12.04 7.300e-11 3895-3910
BL01187B 12.04 4.600e-10 2979-2994 
BL01187B 12.04 4.825e-09 3855-3870 
BL01187A 9.98 5.500e-09 3003-3014
BL01187A 9.98 9.625e-09 190-201 


711 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 8.313e-16 89-101 
BL01209 9.31 9.438e-16 1062-1074 
BL01209 9.31 3.368e-15 2818-2830 
BL01209 9.31 3.842e-15 1110-1122 
BL01209 9.31 4.316e-15 901-913 
BL01209 9.31 4.000e-14 2608-2620

BL01209 9.31 4.000e-14 3413-3425 
BL01209 9.31 5.125e-14 3571-3583 
BL01209 9.31 5.500e-14 1194-1206 
BL01209 9.31 7.750e-14 2902-2914 
BL01209 9.31 8.125e-14 3650-3662 
BL01209 9.31 9.250e-14 1153-1165 
BL01209 9.31 1.000e-13 3730-3742 
BL01209 9.31 6.700e-13 2738-2750
BL01209 9.31 7.000e-13 3689-3701
BL01209 9.31 8.500e-13 2696-2708
BL01209 9.31 3.605e-12 2567-2579
BL01209 9.31 7.632e-12 3453-3465
BL01209 9.31 8.105e-12 2776-2788 
BL01209 9.31 8.579e-12 1024-1036 
BL01209 9.31 1.196e-11 2861-2873
BL01209 9.31 3.543e-11 134-146
BL01209 9.31 5.109e-11 3373-3385
BL01209 9.31 6.087e-11 2947-2959
BL01209 9.31 6.478e-11 3609-3621
BL01209 9.31 9.413e-11 3773-3785
BL01209 9.31 1.346e-10 3818-3830 
BL01209 9.31 3.769e-10 3493-3505 
BL01209 9.31 4.115e-10 3532-3544 
BL01209 9.31 4.981e-10 942-954 
BL01209 9.31 7.231e-10 983-995
BL01209 9.31 9.679e-09 2647-2659 


711 


PR00054 


FUNGAL ZN-CYS BINUCLEAR 
CLUSTER SIGNATURE 


PR00054B 8.73 1.000e-08 3605-3611 


712 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 8.313e-16 89-101 
BL01209 9.31 3.543e-11 134-146


712 


PR00261 


LOW DENSITY LIPOPROTEIN (LDL) 
RECEPTOR SIGNATURE 


PR00261A 11.02 3.288e-16 80-101
PR00261C 11.37 9.115e-16 80-101
PR00261D 12.47 3.286e-15 80-101
PR00261B 14.12 5.985e-15 80-101
PR00261C 11.37 6.178e-12 125-146
PR00261E 11.08 8.227e-12 80-101 
PR00261E 11.08 8.500e-12 125-146 
PR00261F 11.57 6.875e-11 80-101
PR00261D 12.47 7.081e-11 125-146
PR00261A 11.02 7.811e-11 125-146
PR00261F 11.57 2.920e-10 125-146 
PR00261B 14.12 7.667e-10 125-146 


712 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764B 13.56 1.563e-09 120-140 


712 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907B 11.29 5.720e-09 162-178 


714 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 2.765e-25 233-280 
BL00232B 32.79 8.263e-22 458-505 
BL00232B 32.79 4.571e-19 1193-1240 
BL00232B 32.79 8.857e-19 1083-1130 
BL00232B 32.79 2.662e-18 1403-1450 
BL00232B 32.79 5.292e-18 979-1026 
BL00232B 32.79 9.585e-18 1298-1345 
BL00232B 32.79 1.265e-17 672-719 
BL00232B 32.79 1.529e-17 118-165 
BL00232B 32.79 2.588e-17 776-823
BL00232B 32.79 1.386e-16 876-923 
BL00232C 10.65 5.390e-12 1081-1098 
BL00232C 10.65 1.391e-11 334-351
BL00232C 10.65 2.174e-11 1296-1313
BL00232C 10.65 4.522e-11 1401-1418
BL00232C 10.65 4.115e-10 977-994
BL00232B 32.79 7.200e-10 341-388
BL00232C 10.65 9.827e-10 670-687 
BL00232C 10.65 4.474e-09 874-891 
BL00232C 10.65 8.737e-09 231-248 


714 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 4.353e-11 977-994
PR00205B 11.39 4.529e-11 231-248
ritUUzODlJ 11.39 /.Diye-ll luoi-iuyo 
PR00205B 11.39 1.655e-10 1296-1313 
PR00205B 11.39 4.764e-10 1191-1208
PR00205B 11.39 5.091e-10 1401-1418 
PR00205B 11.39 6.400e-10 456-473
PR00205B 11.39 1.000e-09 334-351
PR00205B 11.39 1.763e-09 874-891 
PR00205B 11.39 7.712e-09 563-580
PR00205B 11.39 9.085e-09 670-687 


715 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 2.765e-25 233-280 
BL00232B 32.79 8.263e-22 458-505 
BL00232B 32.79 4.571e-19 1193-1240 
BL00232B 32.79 8.857e-19 1083-1130 
BL00232B 32.79 2.662e-18 1403-1450 
BL00232B 32.79 5.292e-18 979-1026 
BL00232B 32.79 9.585e-18 1298-1345 
BL00232B 32.79 1.265e-17 672-719 
BL00232B 32.79 1.529e-17 118-165 
BL00232B 32.79 2.588e-17 776-823 
BL00232B 32.79 1.386e-16 876-923 
BL00232C 10.65 5.390e-12 1081-1098
BL00232C 10.65 1.391e-11 334-351
BL00232C 10.65 2.174e-11 1296-1313
BL00232C 10.65 4.522e-11 1401-1418
BL00232C 10.65 4.115e-10 977-994 
BL00232B 32.79 7.200e-10 341-388 
BL00232C 10.65 9.827e-10 670-687 
BL00232C 10.65 4.474e-09 874-891 
BL00232C 10.65 8.737e-09 231-248 


715 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 4.353e-11 977-994
PR00205B 11.39 4.529e-11 231-248

DDAmA^D 1 t 1Q 1 ^OOa 1 1 1 AQ 1 1AOQ 

rKUUZlDrJ 11.37 /.OZye-ll IU0I-IU70 
PR00205B 11.39 1.655e-10 1296-1313

PR00205B 11.39 4.764e-10 1191-1208 
PR00205B 11.39 5.091e-10 1401-1418 
PR00205B 11.39 6.400e-10 456-473 
PR00205B 11.39 1.000e-09 334-351

PR00205B 11.39 1.763e-09 874-891
PR00205B 11.39 7.712e-09 563-580 
PR00205B 11.39 9.085e-09 670-687 


716 


BL00708 


Prolyl endopeptidase family serine proteins. 


BL00708B 24.91 7.197e-12 706-736 


716 


PF00930 


Dipeptidyl peptidase IV (DPP IV) N-terminal 
region. 


PF00930I 15.96 6.373e-17 748-775
PF00930H 20.16 2.482e-13 669-711
PF00930J 8.78 1.000e-11 800-820
PF00930G 21.30 9.613e-09 629-666 


717 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 3.118e-14 156-172
BL00028 16.07 1.900e-13 352-368
BL00028 16.07 2.565e-12 240-256 
BL00028 16.07 4.130e-12 212-228 
BL00028 16.07 8.435e-12 324-340 
BL00028 16.07 5.154e-11 268-284
BL00028 16.07 6.192e-11 296-312
BL00028 16.07 6.885e-11 184-200


717 


PD00066


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 8.800e-14 172-184
PD00066 13.92 4.857e-12 200-212 
PD00066 13.92 5.286e-12 228-240 
PD00066 13.92 6.143e-12 340-352 
PD00066 13.92 7.000e-12 256-268 
PD00066 13.92 2.957e-11 312-324
PD00066 13.92 5.304e-11 50-62
PD00066 13.92 7.231e-10 78-90 
PD00066 13.92 3.100e-09 284-296 


717 


PR00048 


C2H2-TYPE ZINC FINGER SIGNATURE 


PR00048A 10.52 5.909e-15 321-334 
PR00048A 10.52 1.000e-14 181-194
PR00048A 10.52 1.000e-14 349-362
PR00048A 10.52 3.571e-13 237-250
PR00048A 10.52 4.857e-13 153-166
PR00048A 10.52 1.947e-11 209-222
PR00048A 10.52 3.842e-11 265-278
PR00048A 10.52 5.737e-11 293-306
PR00048B 6.02 9.308e-11 197-206
PR00048B 6.02 6.063e-10 225-234 
PR00048B 6.02 6.063e-10 365-374 
PR00048B 6.02 8.875e-10 169-178 
PR00048B 6.02 5.737e-09 337-346 
PR00048B 6.02 9.053e-09 309-318 


718 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 3.278e-09 70-89 
DM01206B 10.69 4.418e-09 105-124 


718 


BL00048 


Protamine P1 proteins.


BL00048 6.39 7.107e-16 64-90
BL00048 6.39 9.196e-16 63-89
BL00048 6.39 1.132e-12 62-88
BL00048 6.39 2.059e-12 66-92
BL00048 6.39 3.250e-12 65-91
BL00048 6.39 7.618e-12 92-118
BL00048 6.39 2.625e-11 60-86
BL00048 6.39 6.500e-11 113-139
BL00048 6.39 6.750e-11 78-104
BL00048 6.39 6.875e-11 104-130
BL00048 6.39 7.125e-11 112-138
BL00048 6.39 8.625e-11 74-100
BL00048 6.39 2.539e-10 108-134
BL00048 6.39 4.434e-10 61-87
BL00048 6.39 5.855e-10 110-136
BL00048 6.39 6.921e-10 98-124
BL00048 6.39 7.158e-10 109-135
BL00048 6.39 7.750e-10 97-123
BL00048 6.39 8.105e-10 79-105
BL00048 6.39 8.579e-10 19-45
BL00048 6.39 8.934e-10 94-120
BL00048 6.39 9.526e-10 103-129
BL00048 6.39 1.675e-09 101-127
BL00048 6.39 1.900e-09 73-99
BL00048 6.39 3.250e-09 81-107
BL00048 6.39 3.475e-09 111-137
BL00048 6.39 3.700e-09 82-108
BL00048 6.39 3.700e-09 96-122
BL00048 6.39 4.263e-09 99-125
BL00048 6.39 5.163e-09 107-133
BL00048 6.39 5.275e-09 67-93
BL00048 6.39 5.275e-09 80-106
BL00048 6.39 5.388e-09 49-75
BL00048 6.39 6.738e-09 116-142
BL00048 6.39 7.975e-09 124-150
BL00048 6.39 8.650e-09 52-78
BL00048 6.39 8.763e-09 18-44
BL00048 6.39 9.100e-09 21-47
BL00048 6.39 [illegible]e-09 100-126
BL00048 6.39 9.663e-09 102-128
BL00048 6.39 1.000e-08 77-103


720 


PD01719 


PRECURSOR GLYCOPROTEIN SIGNAL 
RE. 


PD01719A 12.89 5.875e-20 1548-1575 
PD01719A 12.89 8.200e-17 1719-1746 
PD01719A 12.89 9.182e-17 1491-1518 
PD01719A 12.89 4.569e-16 1434-1461 
PD01719A 12.89 7.286e-14 1605-1632 
PD01710A 1? 80? 364e-13 1fifi?-1689 


720


BL01187


Calcium-binding EGF-like domain proteins

pattern proteins. 


BL01187B 12.04 3.647e-15 2191-2206 

BL01187B 12.04 5.696e-13 2108-2123 

BL01187B 12.04 7.261e-13 2232-2247 

BL01187A 9.98 4.316e-11 2172-2183

BL01187A 9.98 1.429e-10 2047-2058
BL01187B 12.04 2.286e-10 2023-2038
BL01187A 9.98 1.750e-09 2332-2343


720 


BL01177 


Anaphylatoxin domain proteins. 


BL01177D 17.50 5.167e-09 2042-2059 


720 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 2.256e-10 1000-1023 
BL00240B 24.70 5.395e-10 450-473 
BL00240B 24.70 3.681e-09 1090-1113
BL00240B 24.70 6.170e-09 634-657


720 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 2.091e-10 2353-2363 
PR00010C 11.16 6.357e-09 2196-2206 


720


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PDn?R70R 18 83 fi ?04p-1 1 7<v?-7QS 
PD02870B 18.83 8.306e-11 1126-1158
PD02870D 15.74 4.800e-10 1126-1160
PD02870B 18.83 7.400e-10 393-425
PD02870B 18.83 9.600e-10 670-702
PD02870B 18.83 1.862e-09 945-977
PD02870B 18.83 3.585e-09 1215-1247 
PD02870D 15.74 6.553e-09 854-888 
PD02870B 18.83 6.745e-09 1306-1338 


720 


BL00281 


Bowman-Birk serine protease inhibitors 
family proteins. 


BL00281A 14.18 6.754e-09 2018-2034 


720 


BL00022 


EGF-like domain proteins. 


BL00022B 7.54 1.900e-09 2357-2363 
BL00022B 7.54 7.300e-09 2200-2206 


720 


BL00799 


Granulins proteins. 


BL00799B 11.02 7.429e-09 2014-2049 


720 


DM00864 


EGF-LIKE DOMAIN. 


DM00864B 11.34 7.465e-09 2196-2214


720


PD02327 


GLYCOPROTEIN ANTIGEN PRECURSOR 
IMMUNOGLO. 


PD02327B 19.84 7.818e-09 450-471 


720 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 2.756e-09 679-701 
DM01688G 16.45 6.040e-09 1210-1241 
DM01688D 13.44 8.244e-09 26-48


720 


DM00179 


w KINASE ALPHA ADHESION T-CELL. 


DM00179 13.97 5.737e-10 119-128 
DM00179 13.97 9.053e-10 494-503 
DM00179 13.97 6.870e-09 25-34 
DM00179 13.97 8.043e-09 1223-1232 
DM00179 13.97 8.435e-09 401-410 


720 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907B 11.29 2.479e-11 2344-2360
PR00907B 11.29 3.688e-10 2228-2244 
PR00907G 11.63 9.660e-10 2348-2374 
PR00907G 11.63 9.745e-10 2232-2258 
PR00907G 11.63 9.027e-09 2108-2134


720 


PD00015 


GLYCOPROTEIN PRECURSOR CELL SI. 


PD00015B 5.21 1.000e-08 1279-1285 


721 


BL00674 


AAA-protein family proteins. 


BL00674B 4.46 1.122e-09 452-473 


721 


BL00300 


SRP54-type proteins GTP-binding domain 
proteins. 


BL00300B 20.56 3.228e-09 452-497 


722 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 9.053e-22 618-649 
BL00211B 13.37 3.314e-13 1430-1461 
BL00211A 12.23 2.385e-11 515-526
BL00211A 12.23 1.529e-10 1327-1338 


722 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE 


PR00326A 8.75 1.129e-09 513-533
PR00326A 8.75 2.671e-09 1325-1345 


722 


BL00649 


G-protein coupled receptors family 2 proteins. 


BL00649F 14.99 4.761e-09 857-878 


723 


BL00130 


Uracil-DNA glycosylase proteins. 


BL00130A 13.75 1.000e-08 576-588 


724 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072E 24.12 5.014e-12 156-198 
BL00072D 30.08 7.136e-10 67-117 


725 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e-12 409-421 


725 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 9.816e-12 407-425 


725 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907B 11.29 4.082e-11 143-159


725 


PF00094


von Willebrand factor type D domain 
proteins. 


PF00094A 11.09 5.109e-09 138-147 


725 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243H 17.53 7.632e-09 68-93 


725 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 9.882e-09 145-171 


725 


BL01187 


Calcium-binding EGF-like domain proteins 
pattern proteins. 


BL01187B 12.04 9.100e-14 236-251 
BL01187B 12.04 5.333e-12 191-206 
BL01187B 12.04 6.333e-12 109-124 
BL01187A 9.98 9.250e-09 172-183 
BL01187A 9.98 1.000e-08 217-228


727 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 6.108e-22 898-938 
PD00930A 25.62 3.415e-14 775-800 


727 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479B 12.57 4.706e-12 724-739 


727 


PF00620 


GTPase-activator protein for Rho-like 
GTPases. 


PF00620B 14.20 6.000e-10 825-841 


727 


BL01240 


Purine and other phosphorylases family 2 
proteins. 


BL01240C 25.01 1.414e-09 36-77


729 


BL00142 


Neutral zinc metallopeptidases, zinc-binding 


BL00142 8.38 8.875e-10 412-422


729


PD01719 


PRECURSOR GLYCOPROTEIN SIGNAL 
RE. 


PD01719A 12.89 4.150e-15 572-599 
PD01719A 12.89 3.487e-10 1222-1249 
PD01719A 12.89 6.447e-10 1166-1193 
PD01719A 12.89 1.778e-09 1425-1452 
PD01719A 12.89 7.556e-09 1091-1118 


735 


BL00741 


Guanine-nucleotide dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 1.333e-14 302-324 


742 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 3.571e-13 656-673 
PR00205B 11.39 9.357e-13 233-250 
PR00205B 11.39 9.413e-12 339-356 
PR00205B 11.39 7.055e-10 450-467
PR00205B 11.39 8.691e-10 553-570


742 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 8.615e-24 235-282 
BL00232B 32.79 3.631e-18 555-602 
BL00232B 32.79 9.862e-18 452-499 
BL00232B 32.79 2.110e-15 125-172
BL00232C 10.65 6.500e-13 233-250
BL00232C 10.65 8.750e-13 656-673
BL00232C 10.65 6.087e-11 339-356
BL00232C 10.65 9.827e-10 450-467


745 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 4.375e-15 216-232 
BL00028 16.07 8.313e-15 518-534
BL00028 16.07 1.529e-14 244-260
BL00028 16.07 1.000e-13 188-204
BL00028 16.07 2.350e-13 272-288
BL00028 16.07 1.000e-12 412-428
BL00028 16.07 2.957e-12 356-372
BL00028 16.07 2.957e-12 490-506 
BL00028 16.07 2.957e-12 546-562 
BL00028 16.07 3.348e-12 384-400 
BL00028 16.07 4.522e-12 300-316 
BL00028 16.07 6.870e-12 328-344 
BL00028 16.07 1.000e-11 160-176
BL00028 16.07 3.400e-10 440-456 
BL00028 16.07 1.000e-09 132-148 


745 


PR00048 


C2H2-TYPE ZINC FINGER SIGNATURE 


PR00048A 10.52 5.091e-15 381-394 
PR00048A 10.52 6.727e-15 269-282 
PR00048A 10.52 6.727e-15 543-556 
PR00048A 10.52 7.545e-15 487-500 
PR00048A 10.52 9.182e-15 185-198 
PR00048A 10.52 6.143e-13 213-226
PR00048A 10.52 7.429e-13 409-422
PR00048A 10.52 8.714e-13 241-254 
PR00048A 10.52 8.714e-13 297-310 
PR00048A 10.52 4.706e-12 353-366 
PR00048B 6.02 6.000e-12 173-182 
PR00048B 6.02 3.077e-11 341-350
PR00048B 6.02 7.923e-11 503-512
PR00048B 6.02 1.000e-10 229-238 
PR00048A 10.52 4.522e-10 515-528 
PR00048A 10.52 6.870e-10 129-142
PR00048B 6.02 8.875e-10 531-540
PR00048A 10.52 1.720e-09 157-170 
PR00048A 10.52 2.800e-09 437-450 
PR00048B 6.02 2.895e-09 453-462 
PR00048B 6.02 5.737e-09 313-322 
PR00048A 10.52 6.760e-09 325-338 


745 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI. 


PD00066 13.92 5.200e-14 176-188 
PD00066 13.92 8.200e-14 344-356 
PD00066 13.92 4.000e-13 232-244
PD00066 13.92 1.857e-12 456-468
PD00066 13.92 3.571e-12 534-546

JL JL/v vvvv X -mJ - *F * • *m+ f Xw V *mmt *f *J t v ■ v 

PD00066 13.92 4.000e-12 400-412 
PD00066 13.92 1.000e-11 260-272
PD00066 13.92 1.000e-11 372-384
PD00066 13.92 4.522e-11 204-216
PD00066 13.92 1.000e-10 288-300
PD00066 13.92 7.300e-09 506-518 


746 


PD01066 


PROTEIN ZINC FINGER ZINC-FINGER 
METAL-BINDING NU. 


PD01066 19.43 8.250e-35 37-75 


746 


PD00066




PD00066 13.92 5.200e-14 251-263
PD00066 13.92 8.200e-14 419-431 
PD00066 13.92 4.000e-13 307-319 
PD00066 13.92 1.857e-12 531-543 
PD00066 13.92 4.000e-12 475-487 
PD00066 13.92 1.000e-11 335-347
PD00066 13.92 1.000e-11 447-459
PD00066 13.92 4.522e-11 279-291
PD00066 13.92 1.000e-10 363-375


746 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 4.375e-15 291-307 
BL00028 16.07 1.529e-14 319-335 
BL00028 16.07 1.000e-13 263-279 
BL00028 16.07 2.350e-13 347-363
BL00028 16.07 1.000e-12 487-503
BL00028 16.07 2.957e-12 431-447
BL00028 16.07 3.348e-12 459-475 
BL00028 16.07 4.522e-12 375-391 
BL00028 16.07 6.870e-12 403-419 
BL00028 16.07 1.000e-11 235-251
BL00028 16.07 3.400e-10 515-531 
BL00028 16.07 1.000e-09 207-223 


746 


PR00048 


C2H2-TYPE ZINC FINGER SIGNATURE 


PR00048A 10.52 5.091e-15 456-469 
PR00048A 10.52 6.727e-15 344-357 
PR00048A 10.52 9.182e-15 260-273 
PR00048A 10.52 6.143e-13 288-301 
PR00048A 10.52 7.429e-13 484-497 
PR00048A 10.52 8.714e-13 316-329 
PR00048A 10.52 8.714e-13 372-385 
PR00048A 10.52 4.706e-12 428-441 
PR00048B 6.02 6.000e-12 248-257 
PR00048B 6.02 3.077e-11 416-425
PR00048B 6.02 1.000e-10 304-313
PR00048A 10.52 6.870e-10 204-217
PR00048A 10.52 1.720e-09 232-245
PR00048A 10.52 2.800e-09 512-525 
PR00048B 6.02 2.895e-09 528-537 
PR00048B 6.02 5.737e-09 388-397 
PR00048A 10.52 6.760e-09 400-413 


747 


PF01105 


emp24/gp25L/p24 family. 


PF01105B 25.12 2.868e-25 144-195 


749 


PR00405 


HIV REV INTERACTING PROTEIN
SIGNATURE 


PR00405C 19.41 1.000e-18 579-600 
PR00405A 17.71 8.147e-18 539-558 
PR00405B 11.83 7.300e-17 558-575 


749 


PF00791 


Domain present in ZO-1 and Unc5-like netrin 
receptors. 


PF00791B 28.49 7.688e-09 831-885 


751 


PD01066 


PROTEIN ZINC FINGER ZINC-FINGER 
METAL-BINDING NU. 


PD01066 19.43 6.143e-21 344-382 


751 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI. 


PD00066 13.92 8.500e-13 769-781 
PD00066 13.92 4.857e-12 711-723


751 


PR00048 


C2H2-TYPE ZINC FINGER SIGNATURE 


PR00048A 10.52 4.706e-12 778-791 
PR00048B 6.02 6.538e-11 766-775
PR00048A 10.52 1.000e-10 750-763 
PR00048A 10.52 4.130e-10 602-615 
PR00048B 6.02 6.063e-10 708-717 
PR00048A 10.52 8.043e-10 630-643 
PR00048A 10.52 8.435e-10 692-705 
PR00048A 10.52 1.360e-09 720-733 


751 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 3.118e-14 753-769 
BL00028 16.07 1.346e-11 781-797
BL00028 16.07 3.769e-11 605-621
BL00028 16.07 9.400e-10 723-739 
BL00028 16.07 1.771e-09 695-711 


754 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 4.541e-13 790-816 


754 


BL00477 


Alpha-2-macroglobulin family thiolester 
region proteins. 


BL00477J 19.04 3.382e-27 1241-1271 
BL00477F 17.34 8.500e-25 785-814 
BL00477G 19.43 8.826e-23 983-1014 
BL00477A 13.50 9.800e-23 122-150
BL00477L 23.51 5.500e-16 1437-1469 
BL00477K 17.42 4.529e-14 1382-1405 
BL00477E 17.53 6.538e-13 755-775 
BL00477B 9.05 6.625e-13 209-221 
BL00477I 18.76 2.650e-12 1085-1111
BL00477D 12.73 4.073e-12 729-738 
BL00477H 9.07 5.395e-12 1054-1065 
BL00477C 15.70 1.161e-10 236-252 


755 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514E 14.28 7.750e-12 299-315 
BL00514D 15.35 9.824e-11 280-292
BL00514G 15.98 4.273e-10 362-391 
BL00514H 14.95 6.217e-09 397-421 


756 


BL00790 


Receptor tyrosine kinase class V proteins. 


BL00790I 20.01 7.638e-10 868-898


756 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR. 


PD02870B 18.83 5.309e-09 371-403 


756 


DM00179 


w KINASE ALPHA ADHESION T-CELL. 


DM00179 13.97 7.261e-09 189-198 


756 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014B 14.77 6.400e-10 832-842 
PR00014D 12.04 3.700e-09 671-685
PR00014C 15.44 4.522e-09 857-875
PR00014D 12.04 8.200e-09 875-889 
PR00014D 12.04 9.550e-09 774-788 


757 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 2.149e-09 306-329 


757 


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


PR00019A 11.19 1.450e-11 149-162
PR00019B 11.36 5.050e-10 98-111 
PR00019B 11.36 7.840e-09 122-135 


758 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 2.149e-09 306-329 


758 


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


PR00019A 11.19 1.450e-11 149-162
PR00019B 11.36 5.050e-10 98-111
PR00019B 11.36 7.840e-09 122-135 


759 


BL00649 


G-protein coupled receptors family 2 proteins. 


BL00649C 17.82 4.339e-11 1086-1111


759 


PR00249 


SECRETIN-LIKE GPCR SUPERFAMILY
SIGNATURE 


PR00249C 17.08 4.185e-10 1088-1111 


760 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 8.313e-15 277-293 
BL00028 16.07 1.900e-13 193-209 
BL00028 16.07 6.400e-13 137-153 
BL00028 16.07 6.400e-13 389-405 
BL00028 16.07 4.913e-12 109-125 
BL00028 16.07 8.826e-12 333-349 
BL00028 16.07 1.000e-11 361-377
BL00028 16.07 1.692e-11 249-265
BL00028 16.07 3.077e-11 221-237
BL00028 16.07 6.538e-11 305-321
BL00028 16.07 7.577e-11 165-181


760 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI. 


PD00066 13.92 4.000e-14 265-277 
PD00066 13.92 5.200e-14 97-109 
PD00066 13.92 5.200e-14 293-305 
PD00066 13.92 5.200e-14 321-333 
PD00066 13.92 2.000e-13 209-221 
PD00066 13.92 3.500e-13 181-193 
PD00066 13.92 1.000e-12 377-389 
PD00066 13.92 4.857e-12 237-249 
PD00066 13.92 7.857e-12 125-137 
PD00066 13.92 8.826e-11 405-417
PD00066 13.92 5.200e-09 349-361 


760 


PR00048 


C2H2-TYPE ZINC FINGER SIGNATURE 


PR00048A 10.52 5.500e-14 330-343 
PR00048A 10.52 7.000e-14 246-259 
PR00048A 10.52 9.250e-14 190-203 
PR00048A 10.52 1.643e-13 218-231 
PR00048A 10.52 4.857e-13 274-287 
PR00048A 10.52 1.000e-12 106-119 
PR00048B 6.02 6.000e-12 94-103 
PR00048B 6.02 6.000e-12 402-411 
PR00048A 10.52 4.789e-11 134-147
PR00048B 6.02 5.846e-11 290-299
PR00048B 6.02 5.846e-11 374-383
PR00048A 10.52 9.526e-11 386-399
PR00048A 10.52 1.391e-10 302-315 
PR00048A 10.52 1.783e-10 162-175 
PR00048A 10.52 7.261e-10 414-427 
PR00048B 6.02 8.875e-10 318-327
PR00048B 6.02 5.737e-09 262-271


760 


PD02462 


PROTEIN BOLA TRANSCRIPTION 
REGULATION AC. 


PD02462A 22.48 1.768e-09 270-304 
PD02462A 22.48 6.488e-09 298-332 


761 


PR00121 


SODIUM/POTASSIUM-TRANSPORTING 
ATPASE SIGNATURE 


PR00121D 16.72 6.844e-15 173-194 


761 


BL00154 


E1-E2 ATPases phosphorylation site proteins. 


BL00154E 20.37 2.929e-13 446-486 
BL00154C 12.38 1.540e-12 176-194 


761 


PR00119 


P-TYPE CATION-TRANSPORTING 
ATPASE SUPERFAMILY SIGNATURE


PR00119B 13.94 7.245e-12 180-194 


761 


BL01228 


Hypothetical cof family proteins. 


BL01228D 17.44 6.348e-09 595-619 


763 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 7.686e-09 172-188 


764 


BL00892 


HIT family proteins. 


BL00892A 18.17 2.125e-10 177-207 


764 


BL00064 


L-lactate dehydrogenase proteins. 


BL00064F 25.14 7.720e-09 295-339 


767 


PD02102 


SUBUNIT E V-ATPASE VACUOLAR ATP
SYNTHASE HYDROL. 


PD02102A 16.74 8.318e-09 121-164 


768 


BL00926 


Lysyl oxidase copper-binding region 
proteins. 


BL00926E 14.42 2.976e-22 306-342 
BL00926D 9.03 6.336e-14 260-306 


768 


PR00074 


LYSYL OXIDASE SIGNATURE 


PR00074C 8.72 2.674e-18 311-339 
PR00074A 9.55 2.514e-10 255-283 


768 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420B 22.67 5.500e-29 33-87 
BL00420C 11.90 8.017e-11 118-128
BL00420B 22.67 3.526e-10 147-201 


768 


PR00258 


SPERACT RECEPTOR SIGNATURE 


PR00258A 11.46 5.721e-11 139-155
PR00258E 13.33 7.000e-11 117-129
PR00258B 9.63 2.180e-10 48-59
PR00258C 9.05 2.469e-10 63-73
PR00258A 11.46 2.746e-10 29-45
PR00258D 14.41 4.724e-10 94-108 
PR00258D 14.41 7.429e-09 210-224 


773 


BL01315 


Phosphatidate cytidylyltransferase proteins. 


BL01315C 18.61 1.000e-40 342-385 
BL01315A 22.47 8.650e-28 221-252 
BL01315B 10.40 1.000e-17 253-266 


774 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320A 16.74 5.655e-11 190-204
PR00320C 13.01 8.560e-10 190-204 
PR00320B 12.19 8.425e-09 190-204 


779 


BL01152


Hypothetical hesB/yadR/yfhF family proteins. 


BL01152B 20.12 1.581e-17 70-95 
BL01152C 25.93 1.659e-11 103-149


783 


BL00280 


Pancreatic trypsin inhibitor (Kunitz) family 
proteins. 


BL00280 24.61 7.070e-26 547-590 


783 


PR00453 


VON WILLEBRAND FACTOR TYPE A 
DOMAIN SIGNATURE 


PR00453A 12.79 3.483e-14 265-282 


783 


PR00759 


BASIC PROTEASE (KUNITZ-TYPE) 
INHIBITOR FAMILY SIGNATURE 


PR00759C 14.15 1.205e-10 575-590 
PR00759B 11.26 7.968e-10 565-575


783 


BL01113


C1q domain proteins.


BL01113A 17.99 4.447e-10 54-80 
BL01113A 17.99 4.638e-10 100-126 
BL01113A 17.99 7.702e-10 57-83
BL01113A 17.99 1.865e-09 106-132 
BL01113A 17.99 3.250e-09 60-86 
BL01113A 17.99 3.250e-09 213-239 
BL01113A 17.99 3.423e-09 34-60
BL01113A 17.99 6.365e-09 198-224
BL01113A 17.99 7.231e-09 109-135


/oi 


BL00420


Speract receptor repeat proteins domain
proteins.


BL00420A 20.42 3.213e-10 16-44 
BL00420A 20.42 1.415e-09 100-128 
BL00420A 20.42 7.923e-09 216-244
BL00420A 20.42 8.477e-09 169-197 


/OJ 




Receptor tyrosine kinase class III proteins.


BL00240B 24.70 5.404e-09 336-359 


786 


PR00918 


calicivirus non-structural 
polyprotein family signature 


PR00918A 13.76 4.284e-12 27-47 


786


BL01128


Shikimate kinase proteins.


BL01128A 18.84 6.684e-11 394-427


786


BL00795


Involucrin proteins.


BL00795C 17.06 8.000e-11 191-235


786 


BL00300 


SRP54-type proteins GTP-binding domain 
proteins. 


BL00300B 20.56 4.032e-10 391-436 


786 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE


PR00830A 8.41 4.452e-09 37-56 


786 


BL00113 


Adenylate kinase proteins. 


BL00113A 12.74 3.782e-11 34-50
BL00113B 20.49 4.974e-11 58-101
BL00113A 12.74 5.431e-09 395-411 


786




AAA-protein family proteins.


BL00674B 4.46 5.986e-09 30-51 


786 


PR00819 


CBXX/CFQX SUPERFAMILY 


PR00819B 10.83 7.247e-09 32-47 


786 


PR00364 


DISEASE RESISTANCE PROTEIN 
SIGNATURE 


PR00364A 8.19 8.057e-09 32-47 


786 


PR00449


TRANSFORMING PROTEIN P21 RAS


PR00449A 13.20 8.914e-09 31-52


788 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002B 15.18 1.000e-10 42-55
BL50002A 14.19 3.813e-09 4-22 


789 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002B 15.18 1.000e-10 115-128 
BL50002A 14.19 3.813e-09 77-95


790 


BL00288 


Tissue inhibitors of metalloproteinases 
proteins. 


BL00288A 17.47 9.143e-21 10-39 
RT 0098SP 14 ft? d S00e-18 73-87 
BL00288B 9.44 7.000e-15 54-64


791 


BL00615


C-type lectin domain proteins.


BL00615A 16.68 2.080e-11 156-173


792 


BL00375 


UDP-glycosyltransferases proteins. 


BL00375F 16.99 1.000e-40 270-314 
BL00375G 11.01 1.000e-40 369-408
BL00375E 18.75 3.250e-37 215-264
BL00375D 14.56 5.622e-24 175-202 
BL00375C 18.27 6.478e-24 110-133 
BL00375B 21.22 5.000e-22 47-87 


794


BL01183


ubiE/COQ5 methyltransferase family
proteins.


BL01183B 21.31 6.660e-12 143-187


794


BL01279


Protein-L-isoaspartate(D-aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.862e-11 57-104


795 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 3.045e-21 494-533 


795 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE 


PR00237C 15.69 2.000e-12 508-530 
PR00237B 13.50 4.414e-11 463-484
PR00237D 8.94 5.050e-11 544-565


796 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688B 15.06 2.500e-10 82-129 


797 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688B 15.06 2.500e-10 82-129


798 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688B 15.06 3.628e-09 82-129 


802 


PF00997 


Kappa casein. 


PF00997D 9.95 8.306e-09 506-540 


804 


PD02080 


T-CELL GLYCOPROTEIN CD8 CHAIN
SURFACE ALPHA PRE.


PD02080B 20.69 9.716e-09 20-58


804 


PD01270


RECEPTOR FC IMMUNOGLOBULIN
AFFIN.


PD01270A 17.22 9.806e-09 19-58 


805 


BL00982 


Bacterial-type phytoene dehydrogenase 
proteins. 


BL00982E 9.88 4.857e-ll 24-39 


806 


BL00243


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243H 17.53 8.696e-11 72-97


807 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243H 17.53 8.696e-11 72-97


808 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243H 17.53 8.696e-11 72-97


812 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 2.674e-10 279-302 

TJT AAlvlAD O/l *7A O CK- 1ft 11A Idl 
TIT AA7/1AH 1A 7ft n 7A7*» AO AH(\ AQ1 

ULrUUZ4Ui5 Z4. /u /./uze-uy 


812 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 4.600e-10 512-544 
PD02870B 18.83 7.894e-09 120-152 


813 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR. 


PD02870B 18.83 4.600e-10 2395-2427 
PD02870B 18.83 4.160e-09 1707-1739 
PD02870B 18.83 5.883e-09 1806-1838 
PD02870B 18.83 7.894e-09 2003-2035 

TJT\AT QH/YD 1 O Q1 1 QQQ a AO A1G A&l 


813 


PD00015 


GLYCOPROTEIN PRECURSOR CELL SI. 


PD00015B 5.21 8.000e-09 1481-1487 


813 


BL00240 


Receptor tyrosine kinase class III proteins. 


BL00240B 24.70 2.256e-10 1667-1690
BL00240B 24.70 2.674e-10 2162-2185
BL00240B 24.70 8.535e-10 2257-2280 
BL00240B 24.70 4.064e-09 1570-1593 
BL00240B 24.70 5.213e-09 300-323
BL00240B 24.70 7.702e-09 2353-2376
BL00240B 24.70 8.851e-09 1473-1496


814 


PR00500 


POLYCYSTIC KIDNEY DISEASE 
PROTEIN SIGNATURE 


PR00500B 7.74 6.305e-09 220-240 


814 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR. 


PD02870B 18.83 4.600e-10 2590-2622 
PD02870B 18.83 4.160e-09 1902-1934 
PD02870B 18.83 5.883e-09 2001-2033 
PD02870B 18.83 7.894e-09 2198-2230
PD02870B 18.83 7.989e-09 630-662


814 


PD00015 


GLYCOPROTEIN PRECURSOR CELL SI.


PD00015B 5.21 8.000e-09 1676-1682 


814 


BL00240 


Receptor tyrosine kinase class III proteins. 


BL00240B 24.70 2.256e-10 1862-1885
BL00240B 24.70 2.674e-10 2357-2380
BL00240B 24.70 8.535e-10 2452-2475
BL00240B 24.70 4.064e-09 1765-1788
BL00240B 24.70 5.213e-09 495-518
BL00240B 24.70 7.702e-09 2548-2571
BL00240B 24.70 8.851e-09 1668-1691 


816 


PD01733 


APOLIPOPROTEIN PLASMA LIPID 
TRANSPORT H. 


PD01733B 20.44 6.600e-14 75-129 


816 


PD02807 


APOLIPOPROTEIN E PRECURSOR APO- 
E GLYCOPROTEIN PLAS. 


PD02807D 7.99 4.779e-09 92-141 


817 


PD01733 


APOLIPOPROTEIN PLASMA LIPID 
TRANSPORT H. 


PD01733B 20.44 6.600e-14 75-129 


817 


PD02807 


APOLIPOPROTEIN E PRECURSOR APO-
E GLYCOPROTEIN PLAS.


PD02807D 7.99 4.779e-09 92-141


819


PR00389 


PHOSPHOLIPASE A2 SIGNATURE 


PR00389C 18.33 3.172e-20 56-74 
PR00389B 10.70 8.154e-15 37-55 
PR00389E 12.52 5.385e-14 104-120 


819


BL00118


Phospholipase A2 histidine proteins. 


RT HO 1 1 HP* 1 A ^3 ^ C7^*» 71 AA 7 1 

BL00118D 12.85 7.500e-14 104-119 
p»t 001 1 Rr* i ^ qo a 149a_i o 7Q-Q7 


821 


BL00908 


Mandelate racemase / muconate lactonizing 
enzyme family signa. 


BL00908B 37.71 1.900e-15 209-263 
BL00908A 15.14 5.310e-10 87-113 


822 


PF00956 


Nucleosome assembly protein (NAP).


PF00956B 23.14 1.000e-40 99-139 

PF00956A 11.88 1.000e-13 58-68 
PFOOQ^fiH 7 SI ^ 700p-19 919-949 


822 


BL00824 


Elongation factor 1 beta/beta'/delta chain 
proteins. 


BL00824B 9.21 3.676e-09 286-305 


823 


BL01032 


Protein phosphatase 2C proteins. 


BL01032C 6.14 3.195e-12 147-156 
BL01032H 11.25 5.680e-11 318-330

di (MCMCi 8 V* 8 Q39p-1 1 989-9QS 

DL/U1UJZ.I lv.t£r O.^Vr4.w-\/^ J Is JOO 


824 


PF00094 


von Willebrand factor type D domain 
proteins.


PF00094C 12.88 1.918e-09 124-133 


824 


PD02576 


PRECURSOR GLYCOPROTEIN SIGNAL 
CELL. 


PD02576A 27.60 9.057e-09 101-149 


825 


PR00245 


OLFACTORY RECEPTOR SIGNATURE 


PR00245C 7.84 5.355e-17 121-136 

PR00245B 10.38 3.919e-12 60-74

PR00245E 12.40 1.000e-10 174-188 


825


BL00237


G-protein coupled receptors proteins.


BL00237D 11.23 2.091e-09 165-181


825


PR00237


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE


PR00237G 19.63 8.714e-11 155-181

PR00237E 13.03 9.735e-09 82-105 


826 


PR00245 


OLFACTORY RECEPTOR SIGNATURE 


PR00245C 7.84 5.355e-17 235-250 

PR00245B 10.38 3.919e-12 174-188 
PR00245E 12.40 1.000e-10 288-302


826 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 1.581e-15 89-128 
BL00237D 11.23 2.091e-09 279-295


826 


PR00896 


VASOPRESSIN RECEPTOR SIGNATURE 


PR00896B 9.01 8.962e-09 54-65 


826 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY


PR00237G 19.63 8.714e-11 269-295
PR00237C 15.69 3.829e-10 101-125
PR00237E 13.03 9.735e-09 196-219


827 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243H 17.53 5.650e-14 39-64 
BL00243H 17.53 4.261e-11 5-30


828 




PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930B 33.72 7.070e-19 901-941


831 


PR00193 


MYOSIN HEAVY CHAIN SIGNATURE 


PR00193C 12.60 1.383e-23 177-204 
PR00193B 11.69 2.212e-18 125-150 
PR00193A 15.41 5.925e-12 65-84 


831 


BL00567 


Phosphoribulokinase proteins. 


BL00567A 10.66 9.031e-10 127-145 


832 


PR00193 


MYOSIN HEAVY CHAIN SIGNATURE 


PR00193C 12.60 1.383e-23 177-204 
PR00193B 11.69 2.212e-18 125-150 
PR00193A 15.41 5.925e-12 65-84 


832 


BL00567 


Phosphoribulokinase proteins. 


BL00567A 10.66 9.031e-10 127-145


ID


Database
entry ID


Description


Results


Rid 




Thyroglobulin type-1 repeat proteins proteins.


RT 00484P 17 01 3 647e- 12 358-372 
BL00484B Q 04 4 S29e-1 1 338-351 






Kazal serine protease inhibitors family
proteins.


BL00282 16.88 3.880e-09 143-165




BL00612


Osteonectin domain proteins.


BL00612E 13.12 8.230e-09 274-318


835 


BL00817 


Erythropoietin / thrombopoeitin proteins. 


BL00817A 18.03 8.200e-10 515-545 


835 


PR00251 


BACTERIAL OPSIN SIGNATURE 


PR00251A 12.15 8.820e-10 515-534 


835 


PR00807 


POLLEN ALLERGEN AMB FAMILY 
SIGNATURE 


PR00807A 16.64 8.151e-09 459-476 


836 


BL00817 


Erythropoietin / thrombopoeitin proteins. 


BL00817A 18.03 8.200e-10 515-545


836 


PR00251 


BACTERIAL OPSIN SIGNATURE


PR00251A 12.15 8.820e-10 515-534


836 


PR00807 


POLLEN ALLERGEN AMB FAMILY 
SIGNATURE 


PR00807A 16.64 8.151e-09 459-476 


838 


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


"DTJAAAIAA 11 1 A O /nc n 1 A OOT 1A A 

rKUUUlyA 11. 19 o.430e-lU 3Z/-.54U 
PR00019A 11.19 9.217e-10 182-195
ppnnnioA 11 101 x\*\c* no 97R 901 

PR00019B 11.36 3.520e-09 227-240 


841 


PF00023 


Ank repeat proteins.


PF00023A 16.03 6.464e-09 135-150 


844


PD01270 


RECEPTOR FC IMMUNOGLOBULIN
AFFIN.


PD01270D 24.66 5.378e-09 292-327


844


BL00240


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 9.809e-09 155-178




PR00020


MAM DOMAIN SIGNATURE


PP 00090 A 18 17 5 77fip-19 75Q-777 
PR00020P 13 66 6 932e-10 832-843 


845 


PD01270 


RECEPTOR FC IMMUNOGLOBULIN 


PD01270D 24.66 5.378e-09 292-327 






MAM domain proteins.


BL00740A 13.87 8.313e-12 761-773
BL00740B 19.76 8.500e-09 901-921




PD02080


T-CELL GLYCOPROTEIN CD8 CHAIN
SURFACE ALPHA PRE.


PD02080B 20.69 9.621e-09 538-576




BL00240


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 9.809e-09 155-178


847 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 4.273e-09 839-852 


847


PF00780

jjornain iounu in injjvi-iikc Kinases, mouse 

CllTUIl a LIU yCaol XvWIYa. 


PF00780T 14 69 4 825e-09 165-194 


848


PR00360


C2 DOMAIN SIGNATURE


PR00360B 13.61 4.273e-09 88-101


851 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 8.250e-12 174-197 


851


DM00179


w KINASE ALPHA ADHESION T-CELL.


DM00179 13.97 3.842e-10 218-227


851 


PD02870 


RECEPTOR INTERLEUKIN- 1 
PRECURSOR


PD02870B 18.83 5.500e-10 327-359 


851 


PR00021 


SMALL PROLINE-RICH PROTEIN 

SIGNATURE


PR00021A 4.31 8.405e-09 402-414 


852 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 8.250e-12 170-193 






852


DM00179


w KINASE ALPHA ADHESION T-CELL.


DM00179 13.97 3.842e-10 214-223


852 


PD02870 


RECEPTOR INTERLEUKIN- 1 
PRECURSOR 


PD02870B 18.83 5.500e-10 323-355 


852 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 8.405e-09 398-410


854 


PF00168 


C2 domain proteins. 


PF00168C 27.49 2.636e-10 183-208 
PF00168C 27.49 6.318e-10 316-341 


854 


PR00399 


SYNAPTOTAGMIN SIGNATURE 


PR00399C 12.82 7.324e-12 216-231 
PR00399A 9.52 8.239e-11 145-160



WO 2004/080148 



PCT/US2003/030720 



315 

TABLE 3A 



ID 


Database 
entry ID 


Description 


Result*








PPfW3QO"R 1/1 07 Q 1 1 1 /CA 1T1 
rJ\\j\JjyyD 14.Z/ o.Z//e-ll lOU-l/j 

PR00399D 14.48 3.930e-10 236-246 
ppfiiY^Qon i/i 97 1 01 no 901 1f\A 


854


PR00360


C2 DOMAIN SIGNATURE


Ppon^/^YR 1^ £i £ ao7#a 19 9nn 913 
PR00360A 14.59 6.538e-11 304-316

PP00160R 1** £1 R fv*£p-1 1 

PR00360A 14.59 2.184e-09 173-185 


R^ 


pr*01 71 q 
r UK) I / 1 y 


PP PPT TP QfYP m VPnPP nTFTNT QTOM A T 

RE. 


PD01710A 19 80 ^ dR^p 1£ ^79 


855


BL00142


Neutral zinc metallopeptidases, zinc-binding
region proteins.


DJjuUI'tZ. O.JO / ,J*tJC"l 1 J07"J77 


8^ 
OJJ 


dp aa/ica 




PP004R0R 1^41 0 1 R9#» 1 ft ^R4 409 


857 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE 


PR00833H 2.30 3.077e-09 58-72 


857


PF00930


Dipeptidyl peptidase IV (DPP IV) N-terminal
region.


PF00930J 8.78 1.000e-08 267-287


858 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE 


PR00833H 2.30 3.077e-09 51-65 


858 


PF00930 


Dipeptidyl peptidase IV (DPP IV) N-terminal
region. 


PF00930J 8.78 1.000e-08 260-280 


859 


PR00258 


SPERACT RECEPTOR SIGNATURE 


PR00258A 11.46 8.054e-16 333-349
PR00258B 9.63 1.509e-12 352-363
PR00258E 13.33 1.833e-10 421-433 


859 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420B 22.67 7.582e-30 337-391 
BL00420C 11.90 9.100e-13 422-432
BL00420A 20.42 8.269e-12 249-277
BL00420A 20.42 7.382e-11 264-292
BL00420A 20.42 1.885e-10 288-316 
BL00420A 20.42 7.344e-10 246-274 
BL00420A 20.42 2.246e-09 261-289 


859 


BL01113 


Clq domain proteins. 


BL01113A 17.99 3.189e-13 264-290 
BL01113A 17.99 5.909e-11 246-272
BL01113A 17.99 1.383e-10 273-299
BL01113A 17.99 2.149e-10 258-284
BL01113A 17.99 2.915e-10 261-287
BL01113A 17.99 5.596e-10 252-278
BL01113A 17.99 7.128e-10 267-293

BL01113A 17.99 1.692e-09 282-308
BL01113A 17.99 5.154e-09 255-281 


860 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420B 22.67 8.333e-39 397-451 
BL00420C 11.90 9.100e-13 482-492 
BL00420A 20.42 9.135e-12 309-337 

BL00420A 20.42 7.382e-11 324-352
BL00420A 20.42 1.885e-10 348-376
BL00420A 20.42 7.639e-10 306-334 
BL00420A 20.42 2.246e-09 321-349 


860 


PR00258 


SPERACT RECEPTOR SIGNATURE 


PR00258A 11.46 8.054e-16 393-409 
PR00258B 9.63 1.509e-12 412-423 
PR00258E 13.33 1.833e-10 481-493
PR00258C 9.05 3.667e-09 427-437 


860 


BL01113 


Clq domain proteins. 


BL01113A 17.99 3.189e-13 324-350 



BL01113A 17.99 5.295e-11 306-332
BL01113A 17.99 1.383e-10 333-359
BL01113A 17.99 2.149e-10 318-344
BL01113A 17.99 2.915e-10 321-347
BL01113A 17.99 7.128e-10 327-353
BL01113A 17.99 1.692e-09 342-368
BL01113A 17.99 4.115e-09 312-338
BL01113A 17.99 5.673e-09 315-341



862 



BL00028 



Zinc finger, C2H2 type, domain proteins. 



BL00028 16.07 1.450e-13 222-238
BL00028 16.07 1.000e-12 474-490
BL00028 16.07 8.435e-12 502-518
BL00028 16.07 1.346e-11 306-322
BL00028 16.07 2.731e-11 362-378
BL00028 16.07 2.731e-11 390-406
BL00028 16.07 3.423e-11 250-266
BL00028 16.07 3.423e-11 334-350
BL00028 16.07 7.577e-11 418-434
BL00028 16.07 1.600e-10 194-210
BL00028 16.07 9.400e-10 278-294



862 



PD00066 



PROTEIN ZINC-FINGER METAL-BINDL 



PD00066 13.92 8.200e-16 322-334
PD00066 13.92 7.231e-15 406-418
PD00066 13.92 7.923e-15 462-474
PD00066 13.92 4.600e-14 378-390
PD00066 13.92 5.200e-14 490-502
PD00066 13.92 1.000e-13 210-222
PD00066 13.92 1.000e-13 294-306
PD00066 13.92 3.000e-13 238-250
PD00066 13.92 5.304e-11 266-278
PD00066 13.92 7.652e-11 350-362
PD00066 13.92 7.000e-09 434-446



862 



PR00048 



C2H2-TYPE ZINC FINGER SIGNATURE 



PR00048A 10.52 7.545e-15 415-428 
PR00048A 10.52 2.929e-13 387-400 
PR00048A 10.52 6.786e-13 219-232 
PR00048A 10.52 8.714e-13 443-456 
PR00048A 10.52 2.059e-12 247-260 
PR00048A 10.52 2.059e-12 331-344 
PR00048A 10.52 5.235e-12 471-484 
PR00048A 10.52 9.471e-12 499-512 
PR00048B 6.02 2.385e-11 319-328
PR00048B 6.02 2.385e-11 487-496
PR00048A 10.52 9.053e-11 303-316
PR00048B 6.02 1.563e-10 375-384
PR00048A 10.52 2.957e-10 359-372
PR00048A 10.52 3.348e-10 191-204 
PR00048B 6.02 8.313e-10 459-468 
PR00048A 10.52 9.217e-10 275-288 
PR00048B 6.02 9.438e-10 207-216 
PR00048B 6.02 1.947e-09 263-272 
PR00048B 6.02 3.368e-09 235-244 
PR00048B 6.02 3.368e-09 291-300 
PR00048B 6.02 7.158e-09 403-412 



863 



PD01234 



PROTEIN NUCLEAR BROMODOMAIN 



PD01234B 15.53 3.250e-09 568-585 






TRANS. 




865 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 1.257e-10 225-239
PR00320A 16.74 4.441e-10 225-239 


865 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 9.053e-09 227-237 


867 


BL00600 


Aminotransferases class-III pyridoxal-
phosphate attachment si.


BL00600E 16.43 1.771e-17 302-330 
BL00600A 17.98 3.880e-17 98-121 
BL00600G 12.43 9.625e-17 377-395 
BL00600B 19.60 5.091e-15 160-185 
BL00600F 8.77 2.421e-12 343-355 
BL00600C 16.18 6.040e-12 190-205
BL00600D 8.71 1.000e-10 281-294 


868 


BL00600 


Aminotransferases class-III pyridoxal-
phosphate attachment si.


BL00600E 16.43 1.771e-17 199-227 
BL00600G 12.43 9.625e-17 274-292 
BL00600B 19.60 2.703e-14 57-82 
BL00600F 8.77 2.421e-12 240-252 
BL00600C 16.18 6.040e-12 87-102 
BL00600D 8.71 1.000e-10 178-191 


869 


BL00021 


Kringle domain proteins. 


BL00021D 24.56 1.188e-24 248-289 
BL00021B 13.33 2.983e-13 88-105 


869 


BL00134 


Serine proteases, trypsin family, histidine 
proteins. 


BL00134C 13.45 8.800e-15 276-289 
BL00134A 11.96 9.438e-15 88-104 
BL00134B 15.99 3.676e-12 237-260 


869 


BL00495 


Apple domain proteins. 


BL00495O 13.75 8.597e-16 267-295 
BL00495N 11.04 2.235e-11 229-263
BL00495K 12.58 4.990e-10 90-122 


869 


PR00722 


CHYMOTRYPSIN SERINE PROTEASE 
FAMILY (S1) SIGNATURE


PR00722C 10.87 3.571e-14 236-248 
PR00722A 12.27 5.966e-14 89-104
PR00722B 12.51 9.571e-10 145-159 


869 


BL01253 


Type I fibronectin domain proteins. 


BL01253H 13.15 3.609e-23 258-292 
BL01253G 11.34 4.103e-15 236-249 
BL01253D 4.84 4.360e-09 88-101 


870 


BL00188 


Biotin-requiring enzymes attachment site 
proteins. 


BL00188 30.29 9.122e-09 154-199 


873 


DM00758 


AGRIN. 


DM00758 13.12 6.459e-10 93-108 


873 


BL00612 


Osteonectin domain proteins. 


BL00612B 11.35 1.284e-09 86-118 


873 


DM00060 


338 kw NEUREXIN ALPHA III CYSTEINE.


DM00060 6.92 8.000e-11 1048-1057
DM00060 6.92 4.060e-09 128-137 


873 


BL01185 


C-terminal cystine knot proteins. 


BL01185B 21.14 4.388e-09 234-282


873 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010A 11.79 1.450e-12 46-57
PR00010C 11.16 2.333e-11 184-194
PR00010C 11.16 9.333e-11 296-306
PR00010C 11.16 4.273e-10 66-76
PR00010C 11.16 7.000e-10 28-38
PR00010A 11.79 7.097e-10 488-499
PR00010C 11.16 3.571e-09 546-556
PR00010A 11.79 4.231e-09 564-575
PR00010C 11.16 5.929e-09 374-384


873 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764F 16.89 4.699e-10 52-72 
PR00764F 16.89 5.562e-10 170-190 
PR00764F 16.89 6.301e-10 321-341 
PR00764F 16.89 9.753e-10 360-380 











PR00764F 16.89 2.052e-09 570-590 
PR00764F 16.89 2.636e-09 398-418 
PR00764F 16.89 7.312e-09 128-148 
PR00764F 16.89 7.662e-09 282-302 
PR00764F 16.89 7.662e-09 532-552 


873 


PR00011 


TYPE III EGF-LIKE SIGNATURE


PR00011B 13.08 6.425e-09 63-81
PR00011B 13.08 8.521e-09 25-43 


873 


BL00203 


Vertebrate metallothioneins proteins.


BL00203 13.94 8.531e-09 75-120 


873 


BL00022 


EGF-like domain proteins. 


BL00022B 7.54 1.000e-09 378-384 
BL00022A 7.48 9.000e-09 173-179
BL00022A 7.48 9.000e-09 363-369 


873 


BL00279 


Membrane attack complex components / 
perforin proteins. 


BL00279E 37.11 2.000e-13 553-600
BL00279E 37.11 6.875e-13 343-390
BL00279E 37.11 6.803e-12 1031-1078
BL00279E 37.11 2.962e-11 35-82
BL00279E 37.11 5.731e-11 304-351
BL00279E 37.11 7.115e-11 73-120
BL00279E 37.11 7.462e-11 515-562
BL00279E 37.11 1.217e-10 265-312
BL00279E 37.11 4.349e-09 153-200
BL00279E 37.11 9.163e-09 381-428


873 


BL01187 


Calcium-binding EGF-like domain proteins 
pattern proteins. 


BL01187B 12.04 3.333e-12 541-556 
BL01187B 12.04 4.000e-12 179-194 
BL01187B 12.04 8.000e-12 291-306 
BL01187B 12.04 4.300e-11 617-632
BL01187B 12.04 7.900e-11 407-422
BL01187B 12.04 1.514e-10 23-38
BL01187B 12.04 3.829e-10 369-384
BL01187B 12.04 5.371e-10 503-518
BL01187B 12.04 7.171e-10 137-152 
BL01187A 9.98 7.429e-10 486-497 
BL01187B 12.04 7.429e-10 61-76 
BL01187B 12.04 2.800e-09 1057-1072 
BL01187B 12.04 3.475e-09 579-594 
BL01187A 9.98 4.375e-09 44-55 
BL01187B 12.04 7.300e-09 255-270 
BL01187B 12.04 9.550e-09 330-345 


873 


PD00919 


CALCIUM-BINDING PRECURSOR 
SIGNAL R. 


PD00919A 11.53 8.820e-10 280-291
PD00919A 11.53 9.864e-09 568-579 


874 


PR00960 


LMBP PROTEIN SIGNATURE 


PR00960A 10.63 4.667e-09 78-93 


875 


BL00738 


S-adenosyl-L-homocysteine hydrolase 
proteins. 


BL00738J 18.61 1.000e-40 459-508
BL00738H 23.08 5.320e-36 335-387
BL00738F 12.23 7.261e-29 254-285 
BL00738A 16.27 9.660e-27 83-122 
BL00738C 16.53 7.923e-25 148-185 
BL00738G 14.29 6.268e-23 313-334 
BL00738B 12.28 8.085e-21 123-147 
BL00738E 14.18 9.200e-19 228-250 
BL00738I 14.57 5.135e-17 412-449
BL00738D 7.16 5.109e-13 202-216 


875 


BL00836 


Alanine dehydrogenase & pyridine nucleotide 
transhydrogenase. 


BL00836D 22.30 8.622e-09 291-327 
Oil


"DD HA/IK 


TIT? AnVTTTMTM ^TrrNATTJRE 


PR00425C 13.23 3.586e-09 426-445


878 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 2.579e-24 181-217 
BL00514G 15.98 9.111e-12 324-353 
BL00514F 11.65 8.914e-09 271-285 
BL00514D 15.35 9.565e-09 222-234


879 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 2.579e-24 181-217 
BL00514G 15.98 9.111e-12 324-353
BL00514F 11.65 8.914e-09 271-285
BL00514D 15.35 9.565e-09 222-234


880


BL00514


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514C 17.41 2.579e-24 181-217
BL00514G 15.98 9.111e-12 324-353 
BL00514F 11.65 8.914e-09 271-285 
BL00514D 15.35 9.565e-09 222-234 


883 


BL00218 


Amino acid permeases proteins. 


BL00218D 21.49 7.446e-11 244-288
BL00218E 23.30 3.640e-10 325-364


884 


BL00107 


Protein kinases ATP-binding region proteins.


BL00107A 18.39 3.172e-11 158-188


885


BL00615


C-type lectin domain proteins.


BL00615A 16.68 6.538e-10 41-58 


889 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 4.900e-10 239-288 




DM00179


w KINASE ALPHA ADHESION T-CELL.


DM00179 13.97 9.526e-10 118-127


891 


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE


PR00049D 0.00 1.305e-09 155-169 
PR00049D 0.00 6.797e-09 156-170 


892 


BL00633 


Bromodomain proteins. 


BL00633B 13.82 5.950e-21 95-119 
BL00633A 14.69 5.154e-14 74-86
BL00633C 15.24 8.071e-14 421-433
BL00633B 13.82 4.600e-13 388-412 


892 


DM00406 


GLIADIN. 


DM00406 7.73 5.135e-10 970-982 
DM00406 7.73 8.054e-10 753-765


892 


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE 


PR00049D 0.00 8.866e-11 755-769
PR00049D 0.00 9.471e-11 756-770
PR00049D 0.00 2.220e-09 748-762
PR00049D 0.00 3.288e-09 972-986


892 


DM00250


TUMOR. 


DM00250B 13.84 8.031e-11 1009-1032
DM00250A 10.52 6.607e-09 772-787 
DM00250B 13.84 7.568e-09 754-777 
DM00250B 13.84 7.689e-09 755-778 


892 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 3.734e-09 967-979 
PR00021A 4.31 6.582e-09 771-783
PR00021A 4.31 7.722e-09 769-781


892 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 

SIGNATURE


PR00910A 2.51 7.750e-09 255-267 


892 


BL00415 


Synapsins proteins. 


BL00415N 4.29 3.231e-12 749-792 
BL00415N 4.29 6.504e-12 750-793 
BL00415N 4.29 4.857e-11 748-791
BL00415N 4.29 1.824e-10 1003-1046 
BL00415N 4.29 6.221e-10 1002-1045 
BL00415N 4.29 9.313e-10 964-1007 
BL00415N 4.29 2.314e-09 958-1001 
BL00415P 2.37 8.200e-09 747-782 


892 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 3.837e-10 966-984 
PR00209B 4.88 5.696e-10 968-986 
PR00209B 4.88 8.141e-10 752-770 
PR00209B 4.88 8.594e-09 758-776


892 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 5.340e-09 768-817 
BL00904A 8.30 9.489e-09 752-801


892 


PD02059 


CORE POLYPROTEIN PROTEIN GAG 
CONTAINS: P. 


PD02059B 24.48 9.746e-09 867-901 


892 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.313e-12 750-782 
DM00215 19.43 7.000e-12 748-780 
DM00215 19.43 9.438e-12 754-786 
DM00215 19.43 7.000e-11 749-781
DM00215 19.43 8.412e-11 752-784
DM00215 19.43 1.161e-10 953-985 
DM00215 19.43 7.429e-10 948-980 

TYKiTfiAOl^ in 1 HO *7^1 *7C1 

DMUOZID ly.4J l.UUUe-Uy OwoJ 

DM00215 19.43 2.678e-09 759-791 
DM00215 19.43 3.441e-09 753-785 
DM00215 19.43 4.508e-09 240-272 
DM00215 19.43 4.661e-09 241-273 
DM00215 19.43 4.966e-09 765-797 
DM00215 19.43 6.492e-09 954-986 

DM00215 19.43 9.847e-09 747-779 


892 


PR00503


BROMODOMAIN SIGNATURE


ri\\)\)j\)jU zu. 01 i.*fuye-io hzi-hhu 
PR00503B 9.96 7.750e-18 94-110 
ppoo^oip 10 R4 1 790P-1S 1 10-128 

PR (10501 A 14 19 6 824e-13 78-91 
PR00503B 9.96 4.400e-12 387-403 
PR00503D 20.81 1.188e-11 128-147
PR00503C 19.84 1.000e-08 403-421


894 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 2.397e-14 92-114 


894 


PR00290 


KAZAL-TYPE SERINE PROTEASE
INHIBITOR SIGNATURE


PR00290A 10.88 2.286e-11 92-102


894 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450C 12.22 4.532e-09 182-203 


895 


PR00753 


1-AMINOCYCLOPROPANE-1-
CARBOXYLATE SYNTHASE
SIGNATURE


PR00753F 8.01 8.522e-11 171-195


896 


BL00478


LIM domain proteins. 


BL00478B 14.79 4.000e-12 102-116
BL00478B 14.79 6.000e-12 173-187
BL00478B 14.79 6.200e-11 43-57
BL00478B 14.79 9.135e-10 231-245


897 


PR00109 


TYROSINE KINASE CATALYTIC 

DOMAIN SIGNATURE


PR00109B 12.27 5.787e-13 467-485 


897 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479C 12.01 7.300e-13 512-524 


897 


BL00239 


Receptor tyrosine kinase class II proteins. 


BL00239B 25.15 8.948e-13 402-449 


897 


BL00107 


Protein kinases ATP-binding region proteins. 


BL00107A 18.39 9.217e-14 467-497 
BL00107B 13.31 8.714e-11 533-548


897 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 6.442e-09 418-468 


898 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 5.787e-13 654-672 


898 


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins.


BL00479C 12.01 7.300e-13 699-711 
898


BL00239 


Receptor tyrosine kinase class II proteins. 


BL00239B 25.15 8.948e-13 589-636 


898 


BL00107 


Protein kinases ATP-binding region proteins. 


BL00107A 18.39 9.217e-14 654-684 
BL00107B 13.31 8.714e-11 720-735






Octicosapeptide repeat proteins.


PF00564B 24.74 6.442e-09 605-655


900 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007C 15.60 3.893e-18 199-220 
PR00007A 19.33 7.500e-17 124-150 
PR00007B 14.16 2.688e-16 151-170 
PR00007D 9.64 5.154e-11 232-242


900 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 6.400e-11 77-105
BL00420A 20.42 6.164e-10 25-53 
BL00420A 20.42 9.262e-10 68-96 
BL00420A 20.42 1.277e-09 65-93 


900 


BL01113


Clq domain proteins. 


BL01113B 18.26 8.031e-28 130-165 
BL01113C 13.18 7.000e-18 199-218 

HT fi1 1 1 1 A 17 00 ^ 1 1 ^ 101 

BL01113D 7.47 7.231e-12 234-243 
PIT 01 1 1 ^ A 17 OQ % R64p-1 1 ^4-60 
BL01113A 17.99 1.191e-10 71-97 
BL01113A 17.99 1.957e-10 77-103
BL01113A 17.99 1.000e-09 28-54
BL01113A 17.99 5.154e-09 68-94 
BL01113A 17.99 7.577e-09 74-100 
BL01113A 17.99 8.615e-09 83-109 


901


PR00927


ADENINE NUCLEOTIDE
TRANSLOCATOR 1 SIGNATURE 


PR00927A 7.98 9.667e-09 14-26 


902 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 4.494e-12 427-445 


902 


BL00415 


Synapsins proteins. 


BL00415N 4.29 6.771e-10 425-468 


902 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 3.278e-09 448-460 


902 


DM00406 


GLIADIN. 


DM00406 7.73 3.919e-10 427-439 
DM00406 7.73 6.400e-09 448-460 


902 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 5.438e-09 402-419
PR00208A 12.59 7.534e-09 420-437
PR00208A 12.59 8.521e-09 419-436






Involucrin proteins.


BL00795C 17.06 1.105e-10 396-440
BL00795C 17.06 6.651e-10 411-455
BL00795C 17.06 6.965e-10 394-438 
BL00795C 17.06 7.698e-10 422-466 
BL00795C 17.06 2.900e-09 408-452 
BL00795C 17.06 3.800e-09 395-439 
BL00795C 17.06 5.200e-09 425-469 
BL00795C 17.06 9.200e-09 424-468 


905 


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


PR00019A 11.19 8.435e-10 5-18 


908 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 3.250e-10 1480-1494 


908 


PR00457 


ANIMAL HAEM PEROXIDASE 
SIGNATURE 


PR00457E 20.67 3.118e-22 1041-1067
PR00457D 16.81 4.194e-21 1016-1036 
PR00457C 19.25 1.675e-13 998-1016 
PR00457H 15.90 5.680e-13 1292-1306 
PR00457F 13.69 4.750e-12 1094-1104
PR00457G 17.45 8.615e-12 1221-1241
PR00457B 13.29 3.411e-10 846-861 


908 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 1.000e-09 325-348 


908 


PD01270 


RECEPTOR FC IMMUNOGLOBULIN 
AFFIN. 


PD01270A 17.22 4.581e-09 304-343 


908


PR00019


LEUCINE-RICH REPEAT SIGNATURE


PR00019B 11.36 7.480e-09 73-86




909


BL01208


VWFC domain proteins.


BL01208B 15.83 3.250e-10 1511-1525


909 


PR00457 


ANIMAL HAEM PEROXIDASE
SIGNATURE


PR00457E 20.67 3.118e-22 1072-1098 
PR00457D 16.81 4.194e-21 1047-1067
PR00457C 19.25 1.675e-13 1029-1047
PR00457H 15.90 5.680e-13 1323-1337
PR00457F 13.69 4.750e-12 1125-1135
PR00457G 17.45 8.615e-12 1252-1272 
PR00457B 13.29 3.411e-10 877-892 


909 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 1.000e-09 356-379 


909 


PD01270 


RECEPTOR FC IMMUNOGLOBULIN 
AFFIN. 


PD01270A 17.22 4.581e-09 335-374 


909


PR00019


LEUCINE-RICH REPEAT SIGNATURE


PR00019B 11.36 7.480e-09 104-117




910


BL01208


VWFC domain proteins.


BL01208B 15.83 3.250e-10 1373-1387


910 


PR00457 


ANIMAL HAEM PEROXIDASE 
SIGNATURE


PR00457E 20.67 3.118e-22 934-960 
PR00457D 16.81 4.194e-21 909-929
PR00457C 19.25 1.675e-13 891-909 
PR00457H 15.90 5.680e-13 1185-1199 
PR00457F 13.69 4.750e-12 987-997 
PR00457G 17.45 8.615e-12 1114-1134 
PR00457B 13.29 3.411e-10 739-754


910 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 1.000e-09 302-325 


910 


PD01270 


RECEPTOR FC IMMUNOGLOBULIN 
AFFIN. 


PD01270A 17.22 7.677e-09 281-320 


910 


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


PR00019B 11.36 8.920e-09 73-86 


911 


BL00022 


EGF-like domain proteins. 


BL00022B 7.54 3.250e-10 881-887 
BL00022B 7.54 1.000e-09 88-94 


911


PR00764


COMPLEMENT C9 SIGNATURE


PR00764F 16.89 8.274e-10 942-962
PR00764F 16.89 6.377e-09 576-596 


911


PR00010


TYPE II EGF-LIKE SIGNATURE


PR00010A 11.79 3.700e-12 43-54
PR00010C 11.16 5.636e-10 84-94
PR00010C 11.16 6.727e-10 122-132
PR00010A 11.79 8.258e-10 168-179
PR00010A 11.79 1.231e-09 102-113
PR00010C 11.16 5.500e-09 877-887
PR00010C 11.16 7.000e-09 230-240


911 


DM00060 


338 kw NEUREXIN ALPHA III CYSTEINE.


DM00060 6.92 7.250e-11 942-951
DM00060 6.92 8.740e-09 576-585 


911 


BL00279 


Membrane attack complex components / 
perforin proteins. 


BL00279E 37.11 1.000e-10 925-972
BL00279E 37.11 4.470e-10 846-893 
BL00279E 37.11 8.744e-09 559-606 


911 


BL01187 


Calcium-binding EGF-like domain proteins 
pattern proteins. 


BL01187B 12.04 9.667e-12 117-132 
BL01187A 9.98 9.053e-11 166-177
BL01187B 12.04 6.175e-09 834-849
BL01187A 9.98 8.125e-09 41-52
BL01187B 12.04 9.325e-09 183-198 





911 


PD00919 


CALCIUM-BINDING PRECURSOR 
SIGNAL R. 


PD00919A 11.53 9.410e-10 574-585 
PD00919A 11.53 9.864e-09 47-58 


914 


BL00888 


Cyclic nucleotide-binding domain proteins. 


BL00888B 14.79 4.000e-16 161-184 
BL00888B 14.79 1.692e-14 279-302


914 


DM01513 


CAMP-DEPENDENT PROTEIN KINASE 
REGULATORY CHAIN. 


DM01513B 6.81 8.457e-34 198-249 
DM01513B 6.81 2.500e-14 322-373


914 


PR00103 


CAMP-DEPENDENT PROTEIN KINASE 
SIGNATURE 


PR00103B 13.39 1.000e-16 173-187 
PR00103A 9.59 8.105e-15 276-290
PR00103E 17.80 9.591e-15 355-367 
PR00103D 10.83 3.700e-14 334-345 
PR00103B 13.39 5.935e-13 291-305 
PR00103A 9.59 1.500e-12 158-172 
PR00103C 15.68 1.000e-11 322-331
PR00103D 10.83 4.349e-10 210-221 


915 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 8.920e-10 602-615 


916 


PR00087 


LIPOXYGENASE SIGNATURE 


PR00087C 15.00 3.057e-21 373-393 
PR00087A 18.37 7.955e-18 335-352 
PR00087B 15.25 1.000e-16 353-370


916 


BL00711 


Lipoxygenases iron-binding region proteins. 


BL00711E 19.66 8.909e-35 364-400 
BL00711I 18.56 4.250e-34 526-563
BL00711D 17.56 2.800e-24 296-321 
BL00711H 23.34 5.091e-23 484-522 
BL00711C 20.75 2.227e-21 221-249 
BL00711F 19.79 5.065e-16 434-450
BL00711B 14.24 1.290e-15 160-175
BL00711G 21.83 8.636e-12 452-483
BL00711A 15.87 5.645e-11 94-103


916 


PR00467 


MAMMALIAN LIPOXYGENASE
SIGNATURE


PR00467F 11.25 4.661e-18 418-440
PR00467E 9.00 5.500e-17 293-312
PR00467A 8.04 4.000e-13 11-28
PR00467D 16.69 5.210e-12 196-217
PR00467B 17.25 1.831e-11 57-76
PR00467C 12.06 1.662e-09 134-148 


917 


PR00467 


MAMMALIAN LIPOXYGENASE 
SIGNATURE 


PR00467E 9.00 5.500e-17 266-285 
PR00467A 8.04 4.000e-13 11-28
PR00467D 16.69 5.210e-12 169-190
PR00467B 17.25 1.831e-11 57-76


917 


BL00711 


Lipoxygenases iron-binding region proteins. 


BL00711C 20.75 2.227e-21 194-222 
BL00711B 14.24 1.290e-15 131-146
BL00711A 15.87 5.645e-11 94-103


918 


BL00711 


Lipoxygenases iron-binding region proteins. 


BL00711C 20.75 2.227e-21 223-251
BL00711B 14.24 1.290e-15 160-175
BL00711A 15.87 5.645e-11 94-103


918 


PR00467 


MAMMALIAN LIPOXYGENASE 
SIGNATURE 


PR00467E 9.00 5.500e-17 295-314 
PR00467A 8.04 4.000e-13 11-28 
PR00467D 16.69 5.210e-12 198-219 
PR00467B 17.25 1.831e-11 57-76
PR00467C 12.06 1.662e-09 134-148 


927 


PD00919 


CALCIUM-BINDING PRECURSOR 
SIGNAL R. 


PD00919A 11.53 8.377e-10 216-227 


927 




Calcium-binding EGF-like domain proteins
pattern proteins.


TVTftl 1R7R 19 CiA 7 x!9Qa 1H 1 0R 191 
RTi11 1 R7R 1 9 HA 0 ARAp 10 1 80-904 
BL01187B 12.04 2.800e-09 227-242


927 


PR00011 


TYPE III EGF-LIKE SIGNATURE


PR00011D 14.03 4.158e-12 39-57
PR00011B 13.08 2.973e-09 39-57


927 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins.


BL00243H 17.53 7.276e-09 65-90 


927 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 5.929e-09 194-204 
PR00010C 11.16 8.286e-09 113-123


927 


BL01185 


C-terminal cystine knot proteins. 


BL01185B 21.14 9.047e-09 168-216 


927 


DM00060 


338 kw NEUREXIN ALPHA III CYSTEINE.


DM00060 6.92 9.460e-09 139-148 


927 


BL01248 


Laminin-type EGF-like (LE) domain proteins. 


BL01248 11.02 9.660e-09 48-60


9? J? 




RIBOSOMAL PROTEIN P2 SIGNATURE




933 


BL00680 


Methionine aminopeptidase subfamily 1 
proteins.


BL00680 14.37 5.304e-17 173-194 


933 


BL01202 


Methionine aminopeptidase subfamily 2 
proteins. 


BL01202B 26.24 9.671e-10 173-210 


933 


PR00599 


METHIONINE AMINOPEPTIDASE-1
SIGNATURE


PR00599B 12.01 4.600e-20 173-189 

DDAA^OOA 11 1 1/1 1<1 t /C/f 

rKUUDyyA 11. Oj l.Z/Je-14 1M-1(>4 

PR00599D 12.92 3.340e-10 273-285 

PPOfKOOr 1 1 1 "XA 6 Alt* no 9/11 9^5 






PROTEIN SH3 DOMAIN REPEAT


PD00289 9.97 4.960e-10 137-150


940


PD02784 


PROTEIN NUCLEAR
RIBONUCLEOPROTEIN. 


PD02784B 26.46 1.000e-40 217-259
PD02784C 20.76 1.000e-40 335-380
PD02784A 21.09 4.176e-36 178-214
PD02784B 26.46 7.683e-10 370-412 


940 


BL00030 


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 1.857e-09 456-474
BL00030A 14.39 1.000e-08 186-204


941 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e-12 410-422


941 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 9.816e-12 408-426 


941 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907B 11.29 4.082e-11 144-160


941 


PF00094 


von Willebrand factor type D domain 

proteins.


PF00094A 11.09 5.109e-09 139-148 






Integrins beta chain cysteine-rich domain
proteins.




941 


BL01177


Anaphylatoxin domain proteins.


BL01177E 20.64 9.882e-09 146-172


941 


BL01187 


Calcium-binding EGF-like domain proteins 

pattern proteins.


BL01187B 12.04 9.100e-14 237-252
BL01187B 12.04 5.333e-12 192-207
BL01187B 12.04 6.333e-12 110-125
BL01187A 9.98 9.250e-09 173-184
BL01187A 9.98 1.000e-08 218-229


942 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e-12 415-427 


942 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 9.816e-12 413-431 


942 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907B 11.29 4.082e-11 149-165


942 


PF00094 


von Willebrand factor type D domain 
proteins. 


PF00094A 11.09 5.109e-09 144-153 


942 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243H 17.53 7.632e-09 74-99 


942 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 9.882e-09 151-177 


942 


BL01187 


Calcium-binding EGF-like domain proteins 


BL01187B 12.04 9.100e-14 242-257






pattern proteins. 


BL01187B 12.04 5.333e-12 197-212 
BL01187B 12.04 6.333e-12 115-130
BL01187A 9.98 9.250e-09 178-189
BL01187A 9.98 1.000e-08 223-234


943 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.403e-13 274-290 


943 


BL00633 


Bromodomain proteins.


BL00633B 13.82 8.977e-12 178-202


943 


BL00479 


Phorbol esters / diacylglycerol binding 

domain proteins.


BL00479B 12.57 9.460e-10 94-109 


943 


PR00503


BROMODOMAIN SIGNATURE 


PR00503B 9.96 8.667e-10 177-193
PR00503D 20.81 9.069e-09 211-230


944 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.403e-13 274-290 


944 


BL00633


Bromodomain proteins.


BL00633B 13.82 8.977e-12 178-202


944 


BL00479 


Phorbol esters / diacylglycerol binding 

domain proteins.


BL00479B 12.57 9.460e-10 94-109 


944 


PR00503


BROMODOMAIN SIGNATURE


PR00503B 9.96 8.667e-10 177-193
PR00503D 20.81 9.069e-09 211-230


945 


PF00855 


PWWP domain proteins.


PF00855 13.75 8.403e-13 274-290


945 


BL00633 


Bromodomain proteins.


BL00633B 13.82 8.977e-12 178-202


945 


BL00479 


Phorbol esters / diacylglycerol binding

domain proteins. 


BL00479B 12.57 9.460e-10 94-109


945 


PR00208 


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.868e-10 835-852
PR00208A 12.59 2.233e-09 838-855 


945 


DM00406 


GLIADIN 


DM00406 7.73 9.000e-09 836-848


945 


PR00503 


BROMODOMAIN SIGNATURE


PR00503B 9.96 8.667e-10 177-193
PR00503D 20.81 9.069e-09 211-230 


946 


PF00855


PWWP domain proteins.


PF00855 13.75 8.403e-13 279-295


946 


BL00633 


Bromodomain proteins.


BL00633B 13.82 8.977e-12 183-207


946 


BL00479


Phorbol esters / diacylglycerol binding
domain proteins. 


BL00479B 12.57 9.460e-10 99-114


946 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.868e-10 840-857 
PR00208A 12.59 2.233e-09 843-860 


946 


DM00406 


GLIADIN. 


DM00406 7.73 9.000e-09 841-853 


946 


PR00503 


BROMODOMAIN SIGNATURE 


PR00503B 9.96 8.667e-10 182-198
PR00503D 20.81 9.069e-09 216-235 


950 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907B 11.29 4.039e-10 677-693


950 


PR00206


CONNEXIN SIGNATURE 


PR00206F 16.77 4.250e-09 498-521


950 


PR00169 


POTASSIUM CHANNEL SIGNATURE


PR00169G 9.39 7.932e-09 467-489


951 


BL00427 


Disintegrins proteins.


BL00427 13.93 7.592e-26 443-497


951 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 5.101e-11 342-367


951 


BL00142 


Neutral zinc metallopeptidases, zinc-binding 
region proteins.


BL00142 8.38 7.545e-11 342-352


951 


PR00289 


DISINTEGRIN SIGNATURE 


PR00289A 13.62 2.500e-14 457-476 
PR00289B 11.79 4.226e-10 486-498 


951 


PR00480 


ASTACIN FAMILY SIGNATURE 


PR00480B 15.41 8.909e-10 337-355 


951 


BL00546 


Matrixins cysteine switch. 


BL00546C 16.41 4.255e-09 336-367 


951 


BL00024 


Hemopexin domain proteins. 


BL00024D 17.28 5.596e-09 336-367 


951 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907E 11.70 7.353e-09 629-651 


953 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR. 


PD00078B 13.14 5.500e-11 360-372


953 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 6.000e-12 334-349 
PF00023A 16.03 1.857e-11 156-171
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ppnnnoiA i£ ai 1 "iAia, 1 1 o^< ota 

rr\J\J\JZ,DJ\ IO.Uj o.l*loe-ll Zjj-Z/U 

PF00023B 14.20 3.455e-09 363-372 


953 


PF00791 


Domain present in ZO-1 and Unc5-like netrin
receptors. 


PF00791B 28.49 4.273e-11 334-388
PF00791B 28.49 4.818e-11 301-355

PFnfi7Q1T* OR AQ A RzK** 1H 1 RR 940 
fTUU/yiJD Zo.*ty *f.o £ rDe-lU 100-Z4Z 

PF00791B 28.49 9.339e-09 222-276 




BL00252


Interferon alpha, beta and delta family 
proteins. 


TAT/VYJ^OA Ifi 4Q A 01 1^ 71 ' 
iJ-LA/UZOZA lo.W O.OD/e-Zo 

BL00252B 19.78 2.846e-14 73-123 


954


rKUUzoo 


TMTT3t> IM7P fYNT AT DTJA AKTTl A 

SUBUNIT SIGNATURE 


Ppnno^AA ii <i 1 nnoo 11 A7 70 

JrivUUZOOA lo.Ol l.UUUe-lJ Oi-fy 


956 


PR00081 


GLUCOSE/RIBITOL DEHYDROGENASE 
FAMILY SIGNATURE 


PR00081A 10.53 6.226e-13 34-51 
PR00081F 15.71 7.632e-12 152-172 
pp atiari n i n ir o ro^ 0 i n i hr 110 

JrJtvUUUolo IU.00 Z.o;Oe-iU lUo-liy 


958 


PR00885 


BACTERIAL GENERAL SECRETION 
PATHWAY PROTEIN H SIGNATURE 


PR00885B 8.16 9.143e-10 394-408 


958 


BL00616 


Histidine acid phosphatases phosphohistidine
proteins. 


BL00616A 11.86 7.811e-09 40-47


959 


BL00284 


Serpins proteins. 


BL00284C 28.56 1.000e-34 118-159 
BL00284D 16.34 4.857e-21 224-250 
BL00284B 17.99 5.800e-19 91-111 
BL00284E 19.15 7.577e-18 305-329 


960 


BL00284 


Serpins proteins. 


BL00284C 28.56 2.588e-23 180-227
BL00284A 15.64 7.750e-22 73-96 
BL00284D 16.34 4.857e-21 292-318 
BL00284E 19.15 7.577e-18 373-397 


961 


BL00284 


Serpins proteins. 


BL00284C 28.56 1.000e-34 186-227
BL00284A 15.64 7.750e-22 73-96
BL00284D 16.34 4.857e-21 292-318
BL00284B 17.99 6.625e-18 159-179
BL00284E 19.15 7.577e-18 373-397


962 


BL00284 


Serpins proteins. 


BL00284C 28.56 1.000e-34 204-245
BL00284A 15.64 7.750e-22 73-96
BL00284B 17.99 5.800e-19 177-197 
BL00284E 19.15 7.577e-18 373-397 


964 


BL00427 


Disintegrins proteins. 


BL00427 13.93 2.739e-16 459-513 


964




ASTACIN FAMILY SIGNATURE




964 


BL00142 


Neutral zinc metallopeptidases, zinc-binding 
region proteins. 


BL00142 8.38 1.429e-09 364-374 


964 


PR00289 


DISINTEGRIN SIGNATURE 


PR00289A 13.62 7.000e-14 473-492 
PR00289B 11.79 2.379e-09 502-514


964 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 3.966e-11 763-813
xki c\(\a\ on i £ 7 n/^i» 1 n 7^0 roq 

BL00412D 16.54 4.857e-09 764-814 
BL00412D 16.54 9.357e-09 762-812 


966 


BL01238 


GDA1/CD39 family of nucleoside 
phosphatases proteins. 


BL01238C 14.36 2.174e-17 177-198
BL01238D 10.19 3.302e-13 216-229 
BL01238A 11.72 6.936e-12 59-73 
BL01238B 10.99 1.529e-09 133-143 


967 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.438e-20 95-130 
BL01113D 7.47 9.308e-12 195-204 
BL01113C 13.18 4.750e-10 163-182 


967 


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE 


PR00007B 14.16 7.698e-13 116-135 
PR00007D 9.64 9.654e-11 193-203
PR00007C 15.60 3.656e-10 163-184 
PR00007A 19.33 1.571e-09 89-115 


969


PR00237


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE


PR00237A 11.48 5.355e-09 408-432


970 


BL00290 


Immunoglobulins and major
histocompatibility complex proteins. 


BL00290A 20.89 7.480e-10 160-182
BL00290B 13.17 2.875e-09 226-243 


970 


PR00939 


C2HC-TYPE ZINC-FINGER SIGNATURE 


PR00939B 13.27 8.412e-09 532-540 


971 


BL00289 


Pentaxin family proteins. 


BL00289D 17.60 1.947e-31 409-447 
BL00289C 12.56 8.615e-16 370-388
BL00289A 30.36 7.457e-14 282-312
BL00289B 15.96 8.364e-12 327-341 


971 


PR00895




PR00895E 12.74 5.065e-18 417-436 
PR00895D 14.28 3.769e-17 397-416 
PR00895C 12.29 4.273e-17 370-388 
PR00895A 14.53 8.826e-13 305-319 
PR00895B 14.20 2.154e-12 327-341 
PR00895F 15.41 1.439e-10 436-450 


972 


PF00992 


Troponin. 


PF00992A 16.67 6.447e-09 741-775 


973 


BL00036 


bZIP transcription factors basic domain 
proteins. 


BL00036 9.02 5.737e-11 633-645


973 


PR00043


SIGNATURE


PR00043B 8.73 9.241e-11 633-649


973 


PF00624 


Flocculin repeat proteins. 


PF00624I 9.10 5.125e-10 461-490
PF00624I 9.10 5.800e-10 462-491
PF00624I 9.10 4.331e-09 458-487
PF00624I 9.10 6.457e-09 456-485
PF00624I 9.10 6.811e-09 453-482
PF00624I 9.10 8.441e-09 454-483


977 


PR00048 


C2H2-TYPE ZINC FINGER SIGNATURE 


PR00048A 10.52 2.174e-10 2473-2486 


977 


DM00406 


GLIADIN.


DM00406 7.73 1.400e-09 537-549 


977


PR00021


SMALL PROLINE-RICH PROTEIN
SIGNATURE 


PR00021A 4.31 2.253e-09 538-550 


977 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 2.660e-09 537-586 


977 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.821e-10 543-575 
DM00215 19.43 7.750e-10 531-563 
DM00215 19.43 7.750e-10 559-591
DM00215 19.43 2.525e-09 536-568
DM00215 19.43 4.508e-09 533-565 


977


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE


PR00049D 0.00 9.017e-11 540-554
PR00049D 0.00 9.168e-11 541-555
PR00049D 0.00 2.983e-09 538-552 
PR00049D 0.00 3.288e-09 539-553 
PR00049D 0.00 3.898e-09 543-557 
PR00049D 0.00 4.814e-09 537-551 
PR00049D 0.00 6.034e-09 191-205 


977 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.318e-09 542-553 
977


BL00415


Synapsins proteins.


BL00415N 4.29 8.143e-11 556-599
BL00415N 4.29 8.357e-11 550-593
BL00415N 4.29 6.702e-10 543-586
BL00415N 4.29 8.145e-10 532-575
BL00415N 4.29 8.969e-10 548-591
BL00415N 4.29 3.562e-09 555-598
BL00415N 4.29 4.088e-09 531-574
BL00415N 4.29 9.869e-09 539-582


977 


PR00211


GLUTELIN SIGNATURE


PR00211B 0.86 9.917e-09 551-571 


980 


BL00282 


Kazal serine protease inhibitors family
proteins. 


BL00282 16.88 4.234e-12 73-95 


980


PR00834


HTRA/DEGQ PROTEASE FAMILY
SIGNATURE


PR00834C 15.43 3.613e-20 237-261
PR00834D 12.14 6.455e-18 275-292 
PR00834B 10.09 5.500e-14 196-216 
PR00834E 13.63 5.355e-13 297-314 
PR00834F 10.91 9.526e-12 389-401 
PR00834A 9.80 3.659e-11 175-187


980 


BL00222 


Insulin-like growth factor binding proteins. 


BL00222B 11.09 4.420e-10 22-37 


980 


PR00290 


KAZAL-TYPE SERINE PROTEASE
INHIBITOR SIGNATURE


PR00290B 9.78 4.326e-09 84-95 


980


BL00273


Heat-stable enterotoxins proteins.


BL00273 12.24 8.286e-09 26-38 


981 


PR00792 


PEPSIN (A1) ASPARTIC PROTEASE
FAMILY SIGNATURE 


PR00792A 11.54 5.500e-18 80-100 
PR00792D 12.74 9.069e-13 395-410 
PR00792C 9.10 4.214e-12 312-323 


981 


BL00141 


Eukaryotic and viral aspartyl proteases 
proteins. 


BL00141A 12.10 4.789e-15 87-102 
BL00141E 14.32 6.850e-15 396-419 
BL00141D 6.28 7.300e-11 312-321
BL00141B 12.14 2.929e-10 228-239 


982 


BL00523 


Sulfatases proteins. 


BL00523A 13.36 6.651e-10 44-60


984 


PR00765 


CARBOXYPEPTIDASE A
METALLOPROTEASE (M14) FAMILY
SIGNATURE


PR00765B 15.57 7.857e-16 99-113
PR00765D 14.16 5.500e-11 233-246
PR00765C 12.55 1.290e-10 179-187 


984 


BL00132 


Zinc carboxypeptidases, zinc-binding region 
1 proteins.


BL00132C 21.35 3.308e-28 129-169 
BL00132B 15.93 1.871e-16 99-112 
BL00132A 26.07 1.682e-14 50-90 
BL00132F 13.26 7.254e-14 228-249 
BL00132D 12.70 2.875e-12 173-187 
BL00132E 17.72 3.552e-12 199-225 
BL00132G 10.94 4.541e-10 285-302 


985 


PR00765 


CARBOXYPEPTIDASE A
METALLOPROTEASE (M14) FAMILY
SIGNATURE


PR00765B 15.57 7.857e-16 99-113 
PR00765D 14.16 5.500e-11 233-246
PR00765C 12.55 1.290e-10 179-187 


985 


BL00132 


Zinc carboxypeptidases, zinc-binding region 
1 proteins. 


BL00132C 21.35 3.308e-28 129-169 
BL00132B 15.93 1.871e-16 99-112
BL00132A 26.07 1.682e-14 50-90
BL00132F 13.26 7.254e-14 228-249
BL00132D 12.70 2.875e-12 173-187 
BL00132E 17.72 3.552e-12 199-225 
BL00132G 10.94 4.541e-10 285-302 


990 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI. 


PD00066 13.92 5.304e-11 110-122


991 


BL00107 


Protein kinases ATP-binding region proteins. 


BL00107A 18.39 1.000e-15 139-169 
BL00107B 13.31 4.273e-13 209-224


991 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 7.894e-13 139-157


991 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240E 11.56 6.580e-10 125-162 


994 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 6.936e-13 168-194 
PR00007C 15.60 9.250e-13 243-264 
PR00007B 14.16 9.372e-13 195-214
PR00007D 9.64 5.500e-11 275-285


994 


PR00524 


CHOLECYSTOKININ TYPE A RECEPTOR 
SIGNATURE 


PR00524F 5.36 1.766e-09 94-107 


994 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 7.058e-12 79-107 
BL00420A 20.42 4.689e-10 97-125 
BL00420A 20.42 6.902e-10 82-110 
BL00420A 20.42 1.277e-09 85-113 
BL00420A 20.42 5.292e-09 76-104 


994 


BL01113


C1q domain proteins.


BL01113B 18.26 1.675e-24 174-209 
BL01113A 17.99 1.871e-15 85-111 
BL01113A 17.99 5.091e-14 82-108 
BL01113D 7.47 3.250e-13 277-286
BL01113A 17.99 4.892e-13 76-102 
■RT.01111A 17 OQ d IORp-13 04-170 

BL01113A 17.99 9.757e-13 79-105 
BL01113A 17.99 3.769e-12 88-114 
BL01113A 17.99 6.308e-12 91-117 
BL01113C 13.18 9.294e-12 243-262 
BL01113A 17.99 8.159e-11 70-96
BL01113A 17.99 9.795e-11 97-123
BL01113A 17.99 9.809e-10 73-99
BL01113A 17.99 6.019e-09 103-129 


995 


DM01595 


kw ALLANTOICASE SPAC1F7.09C. 


DM01595D 10.94 8.269e-16 116-140
DM01595I 8.91 2.714e-15 300-317
DM01595I 8.91 9.727e-14 117-134
DM01595D 10.94 3.274e-11 299-323
DM01595E 14.67 6.299e-09 152-184 


997 


BL00720 


Guanine-nucleotide dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e-18 1089-1112 


997 


BL00741 


Guanine-nucleotide dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 4.326e-16 377-399


1001 


BL00048 


Protamine P1 proteins.


BL00048 6.39 6.684e-10 949-975 
BL00048 6.39 3.363e-09 947-973 
BL00048 6.39 9.888e-09 781-807 


1002 


PF00628 


PHD-finger.


PF00628 15.84 8.412e-14 201-215


1002 


BL00048 


Protamine P1 proteins.


BL00048 6.39 6.684e-10 1158-1184 
BL00048 6.39 3.363e-09 1156-1182 
BL00048 6.39 9.888e-09 990-1016 


1003 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320A 16.74 4.103e-11 1132-1146
PR00320C 13.01 8.200e-10 1132-1146
PR00320A 16.74 9.735e-10 1091-1105
PR00320C 13.01 2.500e-09 1091-1105
PR00320B 12.19 6.625e-09 1132-1146 


1004 


PF00569 


Zinc finger present in dystrophin, CBP/p300. 


PF00569 13.42 1.545e-16 21-37


1004


PD00306 


PROTEIN GLYCOPROTEIN PRECURSOR 
RE. 


PD00306A 10.26 2.929e-09 257-270 


1006 


PR00399 


SYNAPTOTAGMIN SIGNATURE 


PR00399A 9 5? 1 9f?4e-0<) 1 fi?-1 77 


1007 


PR00806 


VINCULIN SIGNATURE


PR00806D 11.95 3.963e-09 564-579 


1008 




Amyloidogenic glycoprotein extracellular

domain proteins. 


TVT.0031QP 17 n ^ n9^p in ^oa 
BL00319C 17.12 4.316e-09 563-596 
BL00319C 17.12 5.382e-09 560-593


1008 


PF00922


Vesiculovirus phosphoprotein.


PT700Q99A ID 17 fl 6£9c» OO ^91 £f\A 

rruuyzzA iy. 1 / o.ooze-uv o / i-0U4 


1009 


PR00405 


HIV REV INTERACTING PROTEIN 
SIGNATURE 


PR00405B 11.83 8.385e-15 281-298 
PR00405A 17.71 4.306e-14 262-281 


1009


PR00452


SH3 DOMAIN SIGNATURE


PPOOA^9Ti 1 1 /C^ ^ <AAo HO 0O< qia 


1009


PR00910


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


T>pnnoi/YA 9 <1 o a7/; q none a/n 1 


1011 


BL00240


Receptor tyrosine kinase class III proteins.


HT 00940"R OA 70 9 £7A*» 10 IRA A(\l 

BL00240B 24.70 8.535e-10 479-502 
BL00240B 24.70 7.702e-09 575-598 


1011 


PD02870 


RECEPTOR INTERLEUKIN-1


PD02870B 18.83 4.600e-10 617-649 

PTJ09R70R 1 R 8^ ^ BR'** 00 9R £0 

PD02870B 18.83 7.894e-09 225-257 


1015 


BL00018 


EF-hand calcium-binding domain proteins. 


BL00018 7.41 5.765e-11 147-159


1015 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450C 12.22 1.228e-09 33-54 


1015 


BL00303 


S-100/ICaBP type calcium binding protein. 


BL00303B 26.15 6.559e-09 26-62 


1018 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 1.474e-24 136-175 
BL00237C 13.19 6.400e-14 289-315 
BL00237B 5.28 3.077e-12 244-255 
BL00237D 11.23 9.654e-11 342-358


1018 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE


PR00237E 13.03 2.588e-16 236-259
PR00237D 8.94 8.800e-14 186-207
PR00237B 13.50 2.636e-13 105-126 
risSj\)£5 Ik, ID.Oy 4.yoUe-lo 1jU-1 /Z 
PR00237F 13.57 6.040e-13 294-318 
PR00237A 11.48 3.143e-12 72-96 
PR00237G 19.63 3.531e-12 332-358 
PR00237E 13.03 4.441e-09 234-257 


1018 


PR00238 


OPSIN SIGNATURE 


PR00238B 16.24 2.667e-14 208-220
PR00238A 13.79 8.286e-09 93-105 


1018 


PR00667 


RETINAL PIGMENT EPITHELIUM- 
RETINAL GPCR SIGNATURE 


PR00667B 10.86 8.800e-09 91-106 




PR00019


LEUCINE-RICH REPEAT SIGNATURE


PR00019A 11.19 5.500e-15 378-391
PR00019A 11.19 3.739e-10 134-147
PR00019B 11.36 1.000e-09 535-548
PR00019B 11.36 2.440e-09 375-388

PR00010A 1 1 1Q ^ ^V-OQ 9<\9 9^S 

PR00019B 11.36 4.960e-09 225-238
PR00019A 11.19 7.000e-09 560-573
PR00019B 11.36 7.840e-09 351-364
PR00019B 11.36 9.640e-09 180-193 


1021 


BL00720 


Guanine-nucleotide dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 6.595e-15 996-1019 


1021 


PF00791 


Domain present in ZO-1 and Unc5-like netrin 
receptors. 


PF00791C 20.98 6.011e-12 606-644


1021


PD00289


PROTEIN SH3 DOMAIN REPEAT
PRESYNA.


PD00289 9.97 5.050e-11 625-638


1021


PR00834


HTRA/DEGQ PROTEASE FAMILY
SIGNATURE


PR00834F 10.91 2.946e-09 621-633


1021


BL00888


Cyclic nucleotide-binding domain proteins.


BL00888B 14.79 4.682e-09 355-378


1022


BL00720


Guanine-nucleotide dissociation stimulators
CDC25 family sign.


BL00720B 16.57 6.595e-15 946-969


1022


PF00791


Domain present in ZO-1 and Unc5-like netrin
receptors.


PF00791C 20.98 6.011e-12 556-594


1022


PD00289


PROTEIN SH3 DOMAIN REPEAT
PRESYNA.


PD00289 9.97 5.050e-11 575-588


1022 


PR00834 


HTRA/DEGQ PROTEASE FAMILY 
SIGNATURE 


PR00834F 10.91 2.946e-09 571-583 


1022 


BL00888 


Cyclic nucleotide-binding domain proteins. 


BL00888B 14.79 4.682e-09 305-328 


1024


BL00476


Fatty acid desaturases family 1 proteins. 




1024 


PR00669 


INHIBIN ALPHA CHAIN SIGNATURE


PR00669B 8.27 6.488e-09 204-220 




BL00476


Fatty acid desaturases family 1 proteins. 


PiT 0Ozl7/\P} 1 74 *a 49H<a OO 797 770 


1025 


PR00669 


INHIBIN ALPHA CHAIN SIGNATURE


PR00669B 8.27 6.488e-09 166-182 


1028 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 9.419e-36 133-180 
BL00232B 32.79 5.345e-21 242-289 
BL00232A 27.72 3.727e-20 39-71 

BL00232C 10.65 2.742e-14 240-257

BL00232B 32.79 6.566e-14 357-404 


1028 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 2.909e-15 240-257
PR00205A 14.73 8.457e-11 165-180


1029 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 9.419e-36 133-180 

BL00232B 32.79 5.345e-21 242-289
BL00232A 27.72 3.727e-20 39-71

BL00232C 10.65 2.742e-14 240-257 
BL00232B 32.79 6.566e-14 357-404 


1029 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 2.909e-15 240-257 
PR00205A 14.73 8.457e-11 165-180


1030 


PF00816 


H-NS histone family. 


PF00816B 13.84 9.284e-09 102-131


1030 


PR00124 


ATP SYNTHASE C SUBUNIT 
SIGNATURE 


PR00124A 8.81 9.000e-10 41-60
PR00124A 8.81 9.379e-09 43-62 


1030 


BL00604 


Synaptophysin / synaptoporin proteins. 


BL00604F 5.96 9.696e-09 41-85 


1031 


BL00869 


Renal dipeptidase proteins. 


BL00869C 12.58 3.172e-19 112-147 
BL00869E 13.12 9.129e-18 173-209 

DT AAO/CAT 1 C iCA iC Alia 1 *7 lOI Oi^O 

BLUU869J 15.60 o.0ize-17 3z3-3oZ 
BL00869H 11.08 1.840e-16 272-294
BL00869G 13.55 2.543e-16 245-266
BL00869F 12.77 7.031e-14 210-244

RT OOR^OT 19 09 ^ 974<»-19 90 r 199 
DLAJKJOxjyL L&.yZ. J.Z/HG-1Z zyj-jzz 

BL00869D 14.02 5.282e-10 148-176 
BL00869B 15.55 9.382e-10 84-113 


1032 


BL00218 


Amino acid permeases proteins. 


BL00218D 21.49 7.446e-ll 244-288 
BL00218E 23.30 3.640e-10 325-364


1033 


BL00721 


Formate-tetrahydrofolate ligase proteins. 


BL00721B 13.21 1.000e-40 456-510 
BL00721D 13.90 1.000e-40 648-701 
BL00721E 13.46 1.000e-40 707-755 
BL00721I 18.79 2.500e-40 924-969
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i3LUU/Zlri Zl.ZU o.ziye-iy o/J-yZ3 
BL00721A 15.31 9.719e-32 397-430

JDJLUU/ZIL^ lO.yz H.UUUeoU OUo-04*t 

BL00721F 15.96 8.232e-27 770-811 
i3.LrUU/zivj /.y/ o.ui/e-iu 1-0^3 


1033 


PR00085 


TETRAHYDROFOLATE
DEHYDROGENASE/CYCLOHYDROLASE
FAMILY SIGNATURE 


PR00085C 15.23 4.906e-15 169-190 
ppnnnR^T* 1 <c 07 7 ^sr^ in 1 o/c 1 £i 

PR00085E 15.79 6.216e-09 266-295 


1033 


BL00415 


Synapsins proteins. 


ht nnAi <;xr a oo q ziooo no 1 c £i 


1035 


PR00834 


HTRA/DEGQ PROTEASE FAMILY
SIGNATURE


PPnAQl/lT? 1AQ1 1 QA6t% AO OO OA 


1035 


BL00741 


Guanine-nucleotide dissociation stimulators
CDC24 family sign. 


t>t nn7din 14 77 o q^7a no oi 1 on 
j5i/Uu/*tii> i*f.z/ z.yoze-uy y ii-yjj 


1035 


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE


PR00049D 0.00 4.814e-09 1125-1139 
PR00049D 0.00 5.729e-09 147-161 


1035 


PR00554 


ADENOSINE A2B RECEPTOR
SIGNATURE


ppnn<;<wiT* 17 57 c s^a no 77 a txo 

rssSAjJj'tD lZ.Dz o.ojDe-Uy /Z*f-/oZ 


1037 


PR00390 


PHOSPHOLIPASE C SIGNATURE


ppnnionA 1^ no 1 a%q*> on 00^ 111 
£ rvUUjyu/v ij.uy l.^oye-zu zjo-oi^ 


1037 


BL00303 


S-100/ICaBP type calcium binding protein. 


BL00303B 26.15 4.971e-09 135-171 


1037 


BL00292 


Cyclins proteins. 


TiTnnoooA oo 87 ^ 11 no oon o^i 
x>JL#uuzyz/A. zz.o/ j.ii*te-ui/ zzu-zjj 


1039 


PR00245 


OLFACTORY RECEPTOR SIGNATURE 


PR00245B 10.38 5.821e-14 176-190
ppnno/i^ a 1 8 m £ roi a 1 4 7Q 

x*KUUZ4DA 0.07lc-l*t J0-/7 

PR00245E 12.40 6.170e-ll 290-304 
PR00245C 7.84 2.286e-10 237-252 


1039 


BL00237


G-protein coupled receptors proteins. 


TIT nn017 A 77 f& 5 HQ RO 1 OR 
O.L/U UZ j / /\ Z/.OO J.*tUOC-l/y oy-IZO 


1039 


PR00896 


VASOPRESSIN RECEPTOR SIGNATURE 


PR00896B 9.01 7.577e-09 54-65 


1039 


PR00534 


MELANOCORTIN RECEPTOR FAMILY 
SIGNATURE 


PR00534A 11.49 8.586e-09 50-62 


1039 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY 
SIGNATURE 


PR00237B 13.50 6.000e-09 58-79 
PR00237E 13.03 8.941e-09 198-221 


1040 


BL01187 


Calcium-binding EGF-like domain proteins 
pattern proteins. 


BL01187A 9.98 2.125e-12 233-244 
BL01187A 9.98 4.789e-11 286-297
BL01187B 12.04 3.057e-10 348-363


1040 


PD00919 


CALCIUM-BINDING PRECURSOR 
SIGNAL R. 


PD00919D 17.80 1.000e-40 406-456
PD00919D 17.80 1.000e-40 465-515
PD00919G 15.92 1.000e-40 590-633
PD00919H 17.48 1.000e-40 634-675
PD00919I 18.44 1.000e-40 676-724
PD00919J 16.09 1.000e-40 737-775
PD00919K 18.26 1.000e-40 776-810
PD00919L 16.90 1.000e-40 812-851
PD00919C 12.28 9.250e-34 357-386 

T\T\f\nC\ 1 OTT 11 til 1 AAA,. IO CCC CC1 

PD00919r 11.63 I.WK-55 ojo-joo 
PD00919E 11.16 1.000e-32 521-549 
PD00919G 15.92 4.197e-23 453-496 
PD00919G 15.92 1.556e-20 394-437 
PD00919F 11.63 5.103e-20 399-427 
PD00919G 15.92 9.111e-20 550-593 
PD00919D 17.80 3.793e-19 526-576 
PD00919F 11.63 8.397e-18 458-486 
PD00919B 9.47 3.455e-17 308-322 
PD00919D 17.80 6.967e-17 566-616
PD00919A 11.53 3.520e-15 199-210
PD00919F 11.63 6.000e-15 595-623
PD00919D 17.80 3.970e-14 488-538
PD00919D 17.80 8.110e-14 429-479
PD00919F 11.63 3.379e-13 517-545 
PD00919G 15.92 4.757e-12 489-532 
PD00919D 17.80 6.094e-12 370-420 
PD00919D 17.80 9.915e-12 562-612 
PD00919E 11.16 2.517e-11 403-431
PD00919B 9.47 3.714e-11 215-229
PD00919G 15.92 7.224e-11 512-555
PD00919F 11.63 8.372e-11 494-522
PD00919E 11.16 8.382e-11 498-526
PD00919E 11.16 9.899e-11 462-490
rDuuyiyti n.io /.oo3e-iu ooy-jo/ 
PD00919D 17.80 9.061e-10 501-551 
PD00919E 11.16 1.092e-09 599-627 
PD00919D 17.80 1.525e-09 503-553 
PD00919G 15.92 3.638e-09 430-473 

PD00919D 17.80 6.625e-09 524-574 
pnnnoioA 11 n & ioi^mq 9^o-?sn 

ru\jvjy 17A ii.jj 0. / a /C"V/7 aj7"aju 
PDOfiQI OD 17 80 (s 77^e-09 442-492 


1042 


BL01022 


PTR2 family proton/oligopeptide symporters 
proteins. 


BL01022B 22.19 2.241e-15 74-119 
BL01022E 23.51 3.739e-14 440-475 
BL01022A 11.58 2.212e-12 44-62 
BL01022D 9.42 2.946e-12 195-207 
BL01022C 16.62 6.226e-10 160-183


1042 


PR00308 


TYPE I ANTIFREEZE PROTEIN 


PR00308C 3.83 2.169e-09 20-29 


1043 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 3.700e-10 977-1011 


1043 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 5.018e-10 542-574 
DM00215 19.43 8.322e-09 537-569 
DM00215 19.43 8.322e-09 541-573 
DM00215 19.43 8.627e-09 530-562 
DM00215 19.43 9.542e-09 540-572 


1044 


PD01066 


PROTEIN ZINC FINGER ZINC-FINGER 
METAL-BINDING NU. 


PD01066 19.43 9.727e-36 10-48 


1044 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI. 


PD00066 13.92 3.769e-15 384-396 
PD00066 13.92 4.462e-15 244-256 
PD00066 13.92 6.538e-15 468-480 
PD00066 13.92 1.000e-13 300-312
PD00066 13.92 1.000e-13 608-620
PD00066 13.92 9.000e-13 160-172 
PD00066 13.92 3.571e-12 216-228 
PD00066 13.92 4.000e-12 580-592 
PD00066 13.92 5.714e-12 496-508 
PD00066 13.92 2.957e-11 524-536
PD00066 13.92 7.652e-11 328-340
PD00066 13.92 2.385e-10 552-564 
PD00066 13.92 1.600e-09 272-284 
1044


PR00048 


C2H2-TYPE ZINC FINGER SIGNATURE 


PR00048A 10.52 2.636e-15 589-602 
PR00048A 10.52 4.273e-15 253-266 
PR00048A 10.52 5.500e-14 533-546
PR00048A 10.52 4.214e-13 225-238
PR00048A 10.52 5.765e-12 281-294
PR00048A 10.52 7.882e-12 477-490
PR00048A 10.52 1.474e-11 169-182
PR00048A 10.52 1.947e-11 141-154
PR00048A 10.52 3.368e-11 309-322
PR00048A 10.52 8.105e-11 561-574
PR00048A 10.52 9.526e-11 393-406
PR00048B 6.02 1.000e-10 297-306 
PR00048B 6.02 1.563e-10 577-586 
PR00048B 6.02 3.250e-10 353-362 
PR00048B 6.02 3.250e-10 409-418
PR00048B 6.02 3.250e-10 437-446 
PR00048A 10.52 4.522e-10 617-630 
PR00048B 6.02 4.938e-10 241-250 
PR00048B 6.02 7.750e-10 493-502 
PR00048B 6.02 8.875e-10 381-390 
PR00048B 6.02 8.875e-10 465-474 
PR00048A 10.52 2.440e-09 197-210 
PR00048B 6.02 4.789e-09 605-614 


1044 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 7.429e-16 536-552 
BL00028 16.07 2.125e-15 592-608 
BL00028 16.07 4.938e-15 256-272 
BL00028 16.07 5.950e-13 228-244 
BL00028 16.07 1.000e-11 452-468
BL00028 16.07 2.731e-11 396-412
BL00028 16.07 4.115e-11 172-188
BL00028 16.07 5.154e-11 284-300
BL00028 16.07 5.846e-11 480-496
BL00028 16.07 6.538e-11 564-580
BL00028 16.07 9.654e-11 620-636
BL00028 16.07 1.300e-10 144-160 
BL00028 16.07 1.900e-10 340-356 
BL00028 16.07 1.900e-10 424-440 
BL00028 16.07 9.100e-10 116-132 
BL00028 16.07 9.100e-10 200-216 
BL00028 16.07 9.700e-10 368-384 
BL00028 16.07 5.629e-09 508-524 
BL00028 16.07 7.943e-09 312-328 


1046 


PD01795 


PROTEIN AMINOPEPTIDASE 
PRECURSOR HYDROLASE SIGNA. 


PD01795A 10.27 6.667e-09 362-370 


1049 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 7.840e-09 43-55 


1049 


PR00766 


AMILORIDE-SENSITIVE AMINE
OXIDASE SIGNATURE 


PR00766G 11.62 9.905e-09 91-111


1050 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 7.429e-20 141-172 


1051 


PF00569 


Zinc finger present in dystrophin, CBP/p300. 


PF00569 13.42 1.545e-16 21-37 


1051 


PD00306


PROTEIN GLYCOPROTEIN PRECURSOR 
RE. 


PD00306A 10.26 2.929e-09 257-270 
1052


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 5.500e-12 77-110


1052 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290A 20.89 9.100e-12 154-176 


1053 


PR00018 


KRINGLE DOMAIN SIGNATURE 


PR00018A 14.52 3.423e-09 36-51 


1054 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688B 15.06 4.504e-09 85-132 
DM01688J 14.69 8.364e-09 32-68 


1055 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290B 13.17 4.000e-21 281-298 
BL00290A 20.89 4.600e-16 34-56
BL00290A 20.89 4.375e-15 224-246 


1064 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 1.000e-09 453-466
PD00289 9.97 5.034e-09 47-60
PD00289 9.97 5.034e-09 258-271


1064 


PF00595 


PDZ domain proteins (Also known as DHR 
or GLGF). 


PF00595 13.40 9.250e-10 450-460
PF00595 13.40 7.000e-09 255-265 


1067 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.471e-12 126-144 


1067 


BL00107 


Protein kinases ATP-binding region proteins.


BL00107A 18.39 2.800e-22 126-156 
BL00107B 13.31 6.786e-11 196-211


1067 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479C 12.01 3.000e-09 174-186 


1067 


BL00790 


Receptor tyrosine kinase class V proteins. 


BL00790M 8.74 4.857e-09 117-138 


1068 


PR00179 


LIPOCALIN SIGNATURE 


PR00179B 9.56 1.000e-12 120-132 
PR00179C 19.02 1.000e-10 148-163
PR00179A 13.78 5.680e-10 37-49 


1068 


BL00213 


Lipocalin proteins. 


BL00213B 8.78 8.000e-10 120-130 
BL00213A 12.95 9.526e-10 37-50 


1070 


PR00200 


ANNEXIN TYPE IV SIGNATURE 


PR00200G 9.43 5.602e-17 299-325 
PR00200E 10.00 6.160e-16 136-157
PR00200E 10.00 3.012e-13 295-316 
PR00200F 13.72 6.157e-13 219-245 
PR00200E 10.00 4.742e-12 64-85 
PR00200B 7.39 9.063e-12 69-91 
PR00200G 9.43 1.991e-11 140-166
PR00200D 10.01 5.304e-11 109-125
PR00200H 13.68 5.050e-10 343-356 
PR00200B 7.39 2.865e-09 141-163 


1070 


PR00202 


ANNEXIN TYPE VI SIGNATURE 


PR00202G 8.01 1.563e-14 299-325 
PR00202E 13.00 9.613e-13 219-245 
PR00202D 5.58 8.636e-11 136-157
PR00202G 8.01 2.525e-09 140-166 
PR00202D 5.58 3.560e-09 64-85 






ANNEXIN TYPE III SIGNATURE


PRflfllQO'R Ifi 10 7 3R7<»-1R 910 

PR00199D 5.65 1.409e-16 295-316 
PR00199G 9.09 6.354e-16 300-325 
PR00199D 5.65 6.455e-16 136-157 
PR00199D 5.65 1.474e-13 64-85
PR00199B 6.86 2.346e-10 69-91
PR00199B 6.86 5.458e-10 300-322
PR00199B 6.86 8.234e-10 141-163
PR00199C 13.84 6.464e-09 109-125


1070 


PR00197 


ANNEXIN TYPE I SIGNATURE 


PR00197D 7.50 5.629e-16 136-157 
PR00197F 9.03 7.395e-15 299-319
PR00197D 7.50 1.234e-14 295-316
PR00197E 11.89 3.541e-13 219-245
PR00197D 7.50 6.379e-11 64-85
PR00197B 7.56 7.124e-09 69-91 


1070 


PR00198 


ANNEXIN TYPE II SIGNATURE


PR00198D 7.65 2.222e-15 136-157 
PR00198D 7.65 3.647e-13 295-316 
PR00198G 8.09 4.375e-13 299-319 
PR00198D 7.65 9.165e-10 64-85 
PR00198B 8.71 7.529e-09 69-91 
PR00198C 14.32 7.900e-09 109-125 
PR00198G 8.09 8.125e-09 140-160


1070 


BL00223 


Annexins repeat proteins domain proteins. 


BL00223C 24.79 1.000e-40 278-332
BL00223B 28.47 9.679e-39 201-250 
BL00223A 15.59 1.000e-27 132-165 
BL00223A 15.59 6.936e-22 60-93 
BL00223C 24.79 3.077e-17 119-173 
BL00223A 15.59 4.194e-16 291-324 
BL00223C 24.79 2.514e-09 47-101 
BL00223B 28.47 8.533e-09 117-166 


1070 


PR00201 


ANNEXIN TYPE V SIGNATURE 


PR00201G 11.02 7.692e-19 299-325 
PR00201D 10.49 1.656e-11 136-157
PR00201A 6.05 6.242e-11 69-91
PR00201E 12.37 8.040e-11 219-245
PR00201C 11.13 3.897e-10 109-125 
PR00201D 10.49 5.050e-10 64-85 
PR00201G 11.02 6.215e-10 140-166 
PR00201D 10.49 9.910e-10 295-316 
PR00201A 6.05 4.297e-09 300-322 
PR00201H 12.04 7.506e-09 343-356 
PR00201A 6.05 8.842e-09 141-163 


1070 


PR00196 


ANNEXIN FAMILY SIGNATURE 


PR00196D 21.86 2.895e-21 219-245 
PR00196E 9.19 3.077e-20 299-319 
PR00196C 10.36 5.500e-20 136-157 
PR00196A 11.16 7.632e-19 69-91 
PR00196C 10.36 1.500e-15 295-316 
PR00196B 10.68 8.875e-15 109-125
PR00196C 10.36 8.071e-14 64-85 
PR00196A 11.16 2.714e-12 141-163 
PR00196G 11.72 4.250e-12 343-356 
PR00196E 9.19 9.735e-12 140-160 
PR00196F 13.89 1.000e-11 327-342
PR00196A 11.16 8.859e-10 300-322 
PR00196F 13.89 7.938e-09 168-183 
PR00196D 21.86 9.775e-09 135-161 


1071 


BL00610 


Sodium:neurotransmitter symporter family
proteins.


BL00610A 17.73 1.000e-40 52-101
BL00610B 23.65 1.000e-40 115-164
BL00610C 12.94 1.000e-40 212-263
BL00610E 20.34 1.000e-40 372-414
BL00610F 29.02 1.000e-40 469-523
BL00610G 12.89 9.217e-22 528-550 
BL00610D 20.97 4.822e-19 278-330 
1071


PR00176 


SODIUM/NEUROTRANSMITTER 
SYMPORTER SIGNATURE 


PR00176A 16.82 1.529e-26 52-73 
PR00176C 10.84 5.500e-25 124-150 
PR00176G 12.48 2.688e-22 458-478 
PR00176E 11.41 2.000e-21 322-342 
PR00176F 10.73 3.333e-20 376-395 
PR00176B 7.31 1.600e-19 81-100 
PR00176D 9.02 1.321e-18 239-256
PR00176H 15.27 2.440e-18 498-518 


1072 


DM00179 


w KINASE ALPHA ADHESION T-CELL. 


DM00179 13.97 7.652e-09 113-122 


1073 


BL01207 


Glypicans proteins. 


BL01207C 19.08 6.538e-31 250-285 
BL01207B 23.69 9.122e-28 191-236 
BL01207D 23.23 1.692e-24 429-463 
BL01207A 12.21 1.000e-16 62-77
BL01207E 13.70 1.214e-11 487-503


1073 


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE 


PR00049D 0.00 3.898e-09 515-529 


1073 


BL00291 


Prion protein. 


BL00291A 4.49 7.724e-09 530-564 


1073 


PR00829 


MAJOR POLLEN ALLERGEN LOL PI 
FAMILY SIGNATURE 


PR00829E 10.81 9.597e-09 306-320 


1075 


PF00777 


Sialyltransferase family. 


PF00777C 18.60 2.581e-28 294-348 


1078 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 4.541e-13 790-816 


1078 


BL00477 


Alpha-2-macroglobulin family thiolester
region proteins. 


BL00477J 19.04 3.382e-27 1241-1271 
BL00477F 17.34 8.500e-25 785-814 
BL00477G 19.43 8.826e-23 983-1014 
BL00477A 13.50 9.800e-23 122-150
BL00477L 23.51 5.500e-16 1437-1469 
BL00477K 17.42 4.529e-14 1382-1405 
BL00477E 17.53 6.538e-13 755-775 
BL00477B 9.05 6.625e-13 209-221 
BL00477I 18.76 2.650e-12 1085-1111
BL00477D 12.73 4.073e-12 729-738 
BL00477H 9.07 5.395e-12 1054-1065 
BL00477C 15.70 1.161e-10 236-252 


1079 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 4.541e-13 804-830 


1079 


BL00477 


Alpha-2-macroglobulin family thiolester
region proteins. 


BL00477F 17.34 8.500e-25 799-828 
BL00477A 13.50 9.800e-23 135-163 
BL00477E 17.53 6.538e-13 769-789 
BL00477B 9.05 6.625e-13 222-234 
BL00477D 12.73 4.073e-12 743-752 
BL00477C 15.70 1.161e-10 249-265 


1080 


BL00477


Alpha-2-macroglobulin family thiolester
region proteins. 


BL00477A 13.50 9.800e-23 122-150 
BL00477B 9.05 6.625e-13 209-221 
BL00477C 15.70 1.161e-10 236-252 


1081 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 4.541e-13 790-816 


1081 


BL00477 


Alpha-2-macroglobulin family thiolester
region proteins. 


BL00477J 19.04 3.382e-27 1241-1271 
BL00477F 17.34 8.500e-25 785-814 
BL00477G 19.43 8.826e-23 983-1014 
BL00477A 13.50 9.800e-23 122-150 
BL00477L 23.51 8.800e-22 1437-1469 
BL00477K 17.42 4.529e-14 1382-1405 
BL00477E 17.53 6.538e-13 755-775 
BL00477B 9.05 6.625e-13 209-221 
BL00477I 18.76 2.650e-12 1085-1111
BL00477D 12.73 4.073e-12 729-738 
BL00477H 9.07 5.395e-12 1054-1065 
BL00477C 15.70 1.161e-10 236-252 


1081 


BL00115 


Eukaryotic RNA polymerase II heptapeptide 
repeat proteins. 


BL00115V 21.32 5.745e-09 1422-1471 


1082 


BL01177


Anaphylatoxin domain proteins. 


BL01177E 20.64 4.541e-13 791-817 


1082 


BL00477 


Alpha-2-macroglobulin family thiolester
region proteins. 


BL00477F 17.34 8.500e-25 786-815 
BL00477A 13.50 9.800e-23 122-150 
BL00477E 17.53 6.538e-13 756-776 
BL00477B 9.05 6.625e-13 209-221 
BL00477D 12.73 4.073e-12 730-739 
BL00477C 15.70 1.161e-10 236-252 


1083 


BL00122 


Carboxylesterases type-B serine proteins. 


BL00122E 22.02 9.027e-31 195-235 
BL00122A 12.04 5.500e-16 60-80 
BL00122D 12.53 7.545e-16 171-186 
BL00122C 7.91 8.125e-13 142-152
BL00122B 16.84 4.830e-10 122-132
BL00122F 11.10 5.500e-10 247-256
BL00122G 11.67 9.625e-10 500-510 


1083 


PR00878 


CHOLINESTERASE SIGNATURE


PR00878F 5.37 7.171e-09 460-472 


1084 


PD00919 


CALCIUM-BINDING PRECURSOR 
SIGNAL R. 


PD00919B 9.47 7.485e-10 1019-1033 


1084 


BL00203


Vertebrate metallothioneins proteins.


BL00203 13.94 9.138e-10 175-220 


1084 


BL00279 


Membrane attack complex components / 
perforin proteins. 


BL00279E 37.11 9.241e-10 387-434


1084 


PR00011 


TYPE III EGF-LIKE SIGNATURE


PR00011D 14.03 2.696e-09 413-431 


1084 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907G 11.63 7.973e-09 890-916 


1084


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE


PR00049D 0.00 8.017e-09 92-106 


1084 


BL00022 


EGF-like domain proteins. 


BL00022B 7.54 8.200e-09 1187-1193


1084 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 7.667e-11 1183-1193
PR00010C 11.16 1.857e-09 937-947 
PR00010C 11.16 4.857e-09 1687-1697 
PR00010C 11.16 8.286e-09 1642-1652 


1084 


PR00009 


TYPE I EGF SIGNATURE 


PR00009C 14.11 9.118e-09 1058-1069 


1084 


BL01187


Calcium-binding EGF-like domain proteins 
pattern proteins. 


BL01187B 12.04 7.000e-17 1682-1697 
BL01187B 12.04 2.350e-14 1178-1193 
BL01187B 12.04 5.500e-14 1136-1151 
BL01187B 12.04 1.391e-13 642-657 
BL01187B 12.04 4.130e-13 1219-1234 
BL01187B 12.04 4.913e-13 1095-1110 
BL01187B 12.04 9.609e-13 932-947 
BL01187B 12.04 9.667e-12 1054-1069
BL01187B 12.04 4.600e-11 1261-1276
BL01187A 9.98 9.526e-11 997-1008
BL01187B 12.04 1.257e-10 1483-1498
BL01187A 9.98 7.857e-10 1078-1089
BL01187A 9.98 2.875e-09 1243-1254
BL01187B 12.04 3.250e-09 1637-1652
BL01187A 9.98 7.000e-09 914-925
BL01187A 9.98 1.000e-08 1037-1048
1086



PR00014 



FIBRONECTIN TYPE III REPEAT
SIGNATURE 



PR00014A 8.22 8.941e-10 816-825 
PR00014D 12.04 5.950e-09 872-886 
PR00014C 15.44 6.478e-09 854-872 



1086 



BL00790 



Receptor tyrosine kinase class V proteins. 



BL00790I 20.01 6.250e-12 865-895
BL00790I 20.01 7.750e-09 662-692



1087



PD01066



PROTEIN ZINC FINGER ZINC-FINGER
METAL-BINDING NU.



PD01066 19.43 2.737e-24 16-54



1087



BL00028



Zinc finger, C2H2 type, domain proteins.



BL00028 16.07 4.150e-13 219-235
BL00028 16.07 7.300e-13 191-207
BL00028 16.07 4.522e-12 163-179
BL00028 16.07 2.038e-11 247-263



1087 



PD00066 



PROTEIN ZINC-FINGER METAL-BINDL 



PD00066 13.92 7.231e-15 235-247
PD00066 13.92 6.143e-12 179-191
PD00066 13.92 7.923e-10 207-219



1087 



PR00048 



C2H2-TYPE ZINC FINGER SIGNATURE 



PR00048A 10.52 3.250e-14 188-201
PR00048A 10.52 4.000e-14 244-257
PR00048A 10.52 4.706e-12 216-229
PR00048B 6.02 3.250e-10 232-241
PR00048A 10.52 2.440e-09 160-173
PR00048B 6.02 9.053e-09 260-269



1088 



PD01066 



PROTEIN ZINC FINGER ZINC-FINGER 
METAL-BINDING NU. 



PD01066 19.43 2.737e-24 16-54 



1088



BL00028



Zinc finger, C2H2 type, domain proteins.



BL00028 16.07 8.043e-12 163-179



1088



PR00048



C2H2-TYPE ZINC FINGER SIGNATURE



PR00048A 10.52 2.800e-09 160-173
PR00048B 6.02 9.053e-09 176-185



1089 



BL00243 



Integrins beta chain cysteine-rich domain 
proteins. 



BL00243I 31.77 1.127e-09 86-128
BL00243I 31.77 2.775e-09 30-72
BL00243I 31.77 5.437e-09 89-131



1089 



BL01208 



VWFC domain proteins. 



BL01208B 15.83 5.865e-09 114-128 



1089 



PD02283 



PROTEIN SPORULATION REPEAT 
PRECU. 



PD02283C 17.54 5.613e-09 24-51 
PD02283C 17.54 5.613e-09 68-95 
PD02283C 17.54 7.188e-09 93-120 
PD02283C 17.54 7.750e-09 103-130 



1089 



BL00269 



Mammalian defensins proteins. 



BL00269C 16.52 9.289e-09 28-56 
BL00269C 16.52 9.289e-09 72-100 



1089 



BL00203 



Vertebrate metallothioneins proteins. 



BL00203 13.94 6.897e-12 66-111
BL00203 13.94 3.769e-11 70-115
BL00203 13.94 4.165e-11 40-85
BL00203 13.94 6.835e-11 65-110
BL00203 13.94 1.096e-10 61-106
BL00203 13.94 2.723e-10 21-66
BL00203 13.94 2.723e-10 22-67
BL00203 13.94 5.213e-10 91-136
BL00203 13.94 5.883e-10 26-71
BL00203 13.94 7.032e-10 114-159
BL00203 13.94 1.643e-09 85-130
BL00203 13.94 1.735e-09 105-150
BL00203 13.94 2.745e-09 80-125
BL00203 13.94 3.388e-09 56-101
BL00203 13.94 4.214e-09 81-126
BL00203 13.94 5.500e-09 60-105
BL00203 13.94 6.694e-09 100-145
BL00203 13.94 6.969e-09 17-62
BL00203 13.94 7.612e-09 47-92
BL00203 13.94 7.704e-09 101-146
BL00203 13.94 8.531e-09 75-120
BL00203 13.94 8.714e-09 95-140
BL00203 13.94 9.541e-09 25-70


1090 


PD02283 


PROTEIN SPORULATION REPEAT 
PRECU. 


PD02283C 17.54 5.613e-09 28-55 


1090 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 3.069e-12 26-71 
BL00203 13.94 6.266e-10 30-75 
BL00203 13.94 4.398e-09 21-66 
BL00203 13.94 8.071e-09 25-70 


1090 


BL00269 


Mammalian defensins proteins. 


BL00269C 16.52 9.289e-09 32-60 


1091 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243I 31.77 8.676e-10 121-163
BL00243I 31.77 3.915e-09 124-166
BL00243I 31.77 5.690e-09 30-72


1091 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 5.865e-09 149-163 


1091 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 3.670e-11 66-111
BL00203 13.94 4.659e-11 40-85
BL00203 13.94 7.429e-11 70-115
BL00203 13.94 1.862e-10 105-150 
BL00203 13.94 2.723e-10 21-66 
BL00203 13.94 2.723e-10 61-106 
BL00203 13.94 2.915e-10 126-171 
BL00203 13.94 4.064e-10 22-67 
BL00203 13.94 6.457e-10 26-71 
BL00203 13.94 7.032e-10 149-194 
BL00203 13.94 7.319e-10 95-140
BL00203 13.94 1.735e-09 140-185
BL00203 13.94 1.827e-09 115-160
BL00203 13.94 1.918e-09 80-125
BL00203 13.94 3.020e-09 100-145 
BL00203 13.94 3.204e-09 65-110 
BL00203 13.94 4.306e-09 120-165 
BL00203 13.94 5.041e-09 47-92 
BL00203 13.94 5.500e-09 116-161 
BL00203 13.94 6.694e-09 135-180 
BL00203 13.94 6.969e-09 17-62 
BL00203 13.94 7.429e-09 71-116 
BL00203 13.94 7.704e-09 136-181 
BL00203 13.94 8.163e-09 85-130 
BL00203 13.94 8.714e-09 130-175 


1091 


PD02283 


PROTEIN SPORULATION REPEAT 
PRECU. 


PD02283C 17.54 5.613e-09 24-51 
PD02283C 17.54 5.613e-09 68-95 
PD02283C 17.54 7.188e-09 128-155 
PD02283C 17.54 7.750e-09 138-165 
PD02283C 17.54 8.875e-09 123-150 


1091 


BL00269 


Mammalian defensins proteins. 


BL00269C 16.52 9.289e-09 28-56 
BL00269C 16.52 9.289e-09 72-100 


1091 


BL00799 


Granulins proteins. 


BL00799D 12.41 7.661e-09 49-95 
BL00799G 9.41 1.000e-08 39-79
1094


PR00248 


METABOTROPIC GLUTAMATE GPCR 
SIGNATURE 


PR00248A 9.91 7.522e-09 24-45 


1094 


PR00354 


7FE FERREDOXIN SIGNATURE 


PR00354C 5.72 8.157e-09 258-275 


1096 


PR00356 


TYPE II ANTIFREEZE PROTEIN 
SIGNATURE 


PR00356G 10.80 9.862e-11 193-206


1096 


BL00615 


C-type lectin domain proteins. 


BL00615B 12.25 2.731e-09 193-206 
BL00615A 16.68 9.400e-09 94-111 


1097 


PR00356 


TYPE II ANTIFREEZE PROTEIN
SIGNATURE 


PR00356G 10.80 7.658e-09 193-206 


1097 


BL00615 


C-type lectin domain proteins. 


BL00615A 16.68 9.400e-09 94-111 


1098 


PR00245 


OLFACTORY RECEPTOR SIGNATURE 


PR00245A 18.03 6.870e-24 59-80 
PR00245C 7.84 2.421e-19 238-253 
PR00245E 12.40 8.714e-16 291-305 
PR00245D 10.47 6.786e-13 274-285 
PR00245B 10.38 6.906e-13 177-191 


1098 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 8.839e-15 90-129 
BL00237D 11.23 2.364e-09 282-298 


1098 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE 


PR00237B 13.50 1.750e-09 59-80 
PR00237C 15.69 4.600e-09 104-126
PR00237A 11.48 5.065e-09 26-50 
PR00237G 19.63 5.605e-09 272-298 


1098 


PR00023 


ZONA PELLUCIDA SPERM-BINDING 
PROTEIN SIGNATURE 


PR00023E 22.27 9.813e-09 128-145 


1099 


DM00191 


w SPAC8A4.04C RESISTANCE 
SPAC8A4.05C DAUNORUBICIN. 


DM00191D 13.94 9.083e-10 163-201 


1099 


PR00346 


TISSUE FACTOR SIGNATURE 


PR00346H 10.74 8.179e-09 542-565 


1099 


BL00022 


EGF-like domain proteins. 


BL00022B 7.54 1.000e-08 306-312 


1100 


DM00372 


CARCINOEMBRYONIC ANTIGEN 
PRECURSOR AMINO-TERMINAL 
DOMAIN. 


DM00372B 20.31 8.920e-15 363-407 
DM00372B 20.31 3.329e-12 68-112 


1101 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 3.250e-10 1436-1450 


1101 


PR00457 


ANIMAL HAEM PEROXIDASE 
SIGNATURE 


PR00457E 20.67 3.118e-22 997-1023
PR00457D 16.81 4.194e-21 972-992
PR00457C 19.25 1.675e-13 954-972
PR00457H 15.90 5.680e-13 1248-1262
PR00457F 13.69 4.750e-12 1050-1060
PR00457G 17.45 8.615e-12 1177-1197
PR00457B 13.29 3.411e-10 802-817


1101 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 1.000e-09 349-372 


1101 


PD01270 


RECEPTOR FC IMMUNOGLOBULIN 
AFFIN. 


PD01270A 17.22 7.677e-09 328-367 


1101


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


PR00019B 11.36 8.920e-09 73-86 


1102 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 3.250e-10 1412-1426 


1102 


PR00457 


ANIMAL HAEM PEROXIDASE 
SIGNATURE 


PR00457E 20.67 3.118e-22 973-999 
PR00457D 16.81 4.194e-21 948-968 
PR00457C 19.25 1.675e-13 930-948 
PR00457H 15.90 5.680e-13 1224-1238 
PR00457F 13.69 4.750e-12 1026-1036
PR00457G 17.45 8.615e-12 1153-1173
PR00457B 13.29 3.411e-10 778-793


1102 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 1.000e-09 325-348 
1102


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


PR00019B 11.36 7.480e-09 73-86


1102 


PD01270 


RECEPTOR FC IMMUNOGLOBULIN 
AFFIN. 


PD01270A 17.22 7.677e-09 304-343 


1103 


BL00815 


Alpha-isopropylmalate and homocitrate 
synthases proteins. 


BL00815C 21.36 3.118e-09 786-814 


1107 


PD02059 


CORE POLYPROTEIN PROTEIN GAG 
CONTAINS: P. 


PD02059B 24.48 8.352e-09 682-716 


1113 


BL00107 


Protein kinases ATP-binding region proteins. 


BL00107A 18.39 6.885e-12 311-341 


1113 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 7.750e-09 311-329 


1117 


PD01652 


RECEPTOR CELL NK GLYCOPROTEIN 
IMMUNOGLOB. 


PD01652B 8.50 4.021e-09 99-150 
PD01652B 8.50 5.050e-09 2-53 
PD01652A 15.35 7.769e-09 12-47 


1120 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.750e-12 1026-1044 


1120 


PR00452 


SH3 DOMAIN SIGNATURE


PR00452B 11.65 4.115e-11 1036-1051


1120 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 3.000e-10 954-963 
PF00023A 16.03 2.286e-09 925-940 


1120 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR 


PD00078B 13.14 8.000e-11 951-963
PD00078B 13.14 4.522e-09 918-930 


1120 


PF00791


Domain present in ZO-1 and Unc5-like netrin 
receptors. 


PF00791B 28.49 8.024e-16 925-979 
PF00791C 20.98 4.971e-09 939-977 


1120 


PR00499 


NEUTROPHIL CYTOSOL FACTOR 2 
SIGNATURE 


PR00499D 10.18 6.965e-09 1024-1044 


1122 


PF00992 


Troponin. 


PF00992A 16.67 8.461e-09 245-279 


1124 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e-11 69-84
PF00023B 14.20 2.636e-09 131-140 


1124 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR 


PD00078B 13.14 6.087e-09 128-140 


1124 


PF00791 


Domain present in ZO-1 and Unc5-like netrin 
receptors. 


PF00791B 28.49 2.569e-09 135-189 
PF00791B 28.49 9.835e-09 69-123 


1125 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e-11 69-84
PF00023B 14.20 2.636e-09 131-140 


1125 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR 


PD00078B 13.14 6.087e-09 128-140 


1125 


PF00791 


Domain present in ZO-1 and Unc5-like netrin 
receptors. 


PF00791B 28.49 2.569e-09 135-189 
PF00791B 28.49 9.835e-09 69-123 


1128 


PR00248 


METABOTROPIC GLUTAMATE GPCR 
SIGNATURE 


PR00248G 12.67 2.688e-09 53-77 


1129 


DM00516 


186 DISCOIDIN I N-TERMINAL.


DM00516 30.53 8.606e-13 131-175 


1130 


DM00516 


186 DISCOIDIN I N-TERMINAL.


DM00516 30.53 8.606e-13 131-175 


1130 


DM01077 


SEX HORMONE-BINDING GLOBULIN. 


DM01077A 16.30 3.143e-11 386-432


1132 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins.


BL00243I 31.77 4.930e-09 87-129


1133 


BL00107 


Protein kinases ATP-binding region proteins. 


BL00107B 13.31 5.909e-13 195-210 


1133 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109D 17.04 7.609e-09 196-218 
PR00109B 12.27 9.297e-09 126-144 


1135 


PR00402 


TEC/BTK DOMAIN SIGNATURE 


PR00402A 16.09 2.950e-10 664-683


1135 


BL00509 


Ras GTPase-activating proteins. 


BL00509B 10.28 9.800e-09 502-512 


1137 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907B 11.29 3.959e-11 168-184


1137 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.893e-10 333-365 
DM00215 19.43 4.054e-10 328-360
DM00215 19.43 8.232e-10 332-364 


1137 


BL01187 


Calcium-binding EGF-like domain proteins 
pattern proteins. 


BL01187B 12.04 2.957e-13 134-149 
BL01187B 12.04 3.739e-13 261-276 
BL01187B 12.04 2.333e-12 216-231
BL01187A 9.98 3.250e-09 197-208 


1137 


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE 


PR00049D 0.00 3.288e-09 348-362 
PR00049D 0.00 3.288e-09 350-364 


1137 


BL01177 


Anaphylatoxin domain proteins. 


BL01177C 17.39 4.714e-09 128-146 


1137 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243H 17.53 5.855e-09 63-88 


1137 


PF00094 


von Willebrand factor type D domain 
proteins. 


PF00094A 11.09 9.022e-09 163-172 


1137 


BL00022 


EGF-like domain proteins. 


BL00022B 7.54 9.100e-09 75-81 


1137 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 9.357e-09 348-360 


1143 


PR00245 


OLFACTORY RECEPTOR SIGNATURE 


PR00245C 7.84 5.355e-17 121-136 
PR00245B 10.38 3.919e-12 60-74 
PR00245E 12.40 1.000e-10 174-188 


1143 


BL00237 


G-protein coupled receptors proteins. 


BL00237D 11.23 2.091e-09 165-181 


1143 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE 


PR00237G 19.63 8.714e-11 155-181
PR00237E 13.03 9.735e-09 82-105 


1144 


PR00245 


OLFACTORY RECEPTOR SIGNATURE 


PR00245C 7.84 5.355e-17 235-250 
PR00245A 18.03 8.615e-15 58-79 
PR00245B 10.38 3.919e-12 174-188 
PR00245E 12.40 1.000e-10 288-302 


1144 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 1.581e-15 89-128 
BL00237D 11.23 2.091e-09 279-295 


1144 


PR00896 


VASOPRESSIN RECEPTOR SIGNATURE 


PR00896B 9.01 8.962e-09 54-65 


1144 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE 


PR00237G 19.63 8.714e-11 269-295
PR00237C 15.69 3.829e-10 103-125 
PR00237E 13.03 9.735e-09 196-219 


1146 


BL00914 


Syntaxin / epimorphin family proteins. 


BL00914 24.91 6.172e-09 168-217 


1147 


PR00264 


INTERLEUKIN-1 SIGNATURE


PR00264B 20.98 8.453e-11 56-82
PR00264C 17.77 1.851e-10 96-124 


1148 


BL00226 


Intermediate filaments proteins. 


BL00226B 23.86 5.050e-24 96-143 
BL00226D 19.10 8.200e-18 262-308 
BL00226C 13.23 5.610e-14 161-191 
BL00226A 12.77 5.065e-13 380-394 


1151 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 5.500e-38 367-413 
BL00226C 13.23 4.130e-23 266-296 
BL00226A 12.77 9.129e-13 131-145 
BL00226B 23.86 1.338e-10 183-230 


1152 


PR00138 


MATRIXIN SIGNATURE


PR00138A 15.14 7.136e-16 86-99
PR00138B 15.82 3.824e-11 131-146


1152 


BL00546 


Matrixins cysteine switch. 


BL00546A 19.62 7.667e-26 66-95 
BL00546E 10.23 3.475e-19 231-251 
BL00546B 20.11 7.720e-19 155-198
BL00546F 12.40 6.400e-13 268-280
BL00546G 16.84 9.449e-11 288-307


1152 


BL00024 


Hemopexin domain proteins. 


BL00024B 21.53 3.143e-23 105-138 
BL00024C 22.98 8.320e-20 154-202
BL00024F 11.30 2.184e-18 231-251 
BL00024G 13.31 6.192e-13 268-280 
BL00024A 11.49 9.100e-13 86-96 
BL00024H 11.35 8.154e-10 335-346 


1153 


PR00138 


MATRIXIN SIGNATURE


PR00138A 15.14 7.136e-16 86-99
PR00138B 15.82 3.824e-11 131-146


1153 


BL00546 


Matrixins cysteine switch. 


BL00546A 19.62 7.667e-26 66-95 
BL00546E 10.23 3.475e-19 231-251
BL00546B 20.11 7.720e-19 155-198 
BL00546F 12.40 6.400e-13 268-280 
BL00546G 16.84 9.449e-11 288-307


1153 


BL00024 


Hemopexin domain proteins. 


BL00024B 21.53 3.143e-23 105-138 
BL00024C 22.98 8.320e-20 154-202 
BL00024F 11.30 2.184e-18 231-251 
BL00024G 13.31 6.192e-13 268-280 
BL00024A 11.49 9.100e-13 86-96 
BL00024H 11.35 8.154e-10 335-346 


1154 


PR00049 


WILM'S TUMOUR PROTEIN SIGNATURE 


PR00049D 0.00 2.068e-09 10-24 


1155 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400C 24.53 6.029e-17 210-253 
BL00400D 23.26 2.080e-14 274-310 
BL00400A 21.59 1.600e-10 27-58 


1156 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.700e-19 90-128
PD02448B 10.17 2.311e-17 129-176 


1156 


BL00415 


Synapsins proteins. 


BL00415O 3.44 7.395e-09 22-59 


1159 


BL00347 


Poly(ADP-ribose) polymerase zinc finger 
domain proteins. 


BL00347A 12.35 9.795e-15 93-135 


1159 


BL00697 


ATP-dependent DNA ligase AMP-binding
site proteins. 


BL00697D 18.99 1.346e-23 591-617 
BL00697A 21.27 2.929e-19 471-499 
BL00697B 13.40 4.774e-14 506-517 


1160 


BL00284 


Serpins proteins.


BL00284C 28.56 7.600e-25 203-244 
BL00284E 19.15 4.375e-23 401-425 
BL00284D 16.34 5.286e-21 317-343 
BL00284A 15.64 6.192e-17 27-50
BL00284B 17.99 4.414e-13 174-194


1166 


BL01121 


Caspase family histidine proteins.


BL01121A 9.11 5.500e-13 7-17


1166 


PR00376 


INTERLEUKIN-1B CONVERTING
ENZYME SIGNATURE


PR00376A 14.23 7.980e-11 5-18


1167 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870D 15.74 7.000e-10 79-113


1167 


PD01652 


RECEPTOR CELL NK GLYCOPROTEIN 
IMMUNOGLOB. 


PD01652B 8.50 3.143e-29 209-260 
PD01652B 8.50 5.457e-18 107-158
PD01652A 15.35 6.438e-14 117-152 
PD01652A 15.35 3.732e-10 24-59 
PD01652B 8.50 7.448e-10 14-65 
PD01652A 15.35 4.231e-09 219-254 


1169 


BL00615 


C-type lectin domain proteins. 


BL00615A 16.68 7.231e-10 125-142 


1171 


PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308A 5.90 9.156e-13 158-172
PR00308C 3.83 6.640e-12 161-170
PR00308B 4.28 1.806e-10 161-172
PR00308A 5.90 4.873e-10 162-176
PR00308C 3.83 8.062e-10 165-174



1171



PR00456



RIBOSOMAL PROTEIN P2 SIGNATURE



PR00456E 3.06 5.671e-09 163-177



1171



BL00678



Trp-Asp (WD) repeat proteins proteins.



BL00678 9.67 2.800e-10 429-439
BL00678 9.67 5.263e-09 480-490
BL00678 9.67 6.211e-09 249-259



1171 



PR00833 



POLLEN ALLERGEN POA PI 
SIGNATURE 



PR00833H 2.30 7.750e-10 164-178
PR00833H 2.30 7.923e-09 161-175 



1171 



PR00320 



G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 



PR00320A 16.74 4.000e-13 427-441
PR00320B 12.19 8.269e-12 478-492
PR00320A 16.74 5.966e-11 478-492
PR00320C 13.01 6.478e-11 478-492
PR00320C 13.01 9.217e-11 427-441
PR00320A 16.74 9.690e-11 247-261
PR00320B 12.19 3.057e-10 247-261
PR00320C 13.01 6.040e-10 247-261
PR00320B 12.19 6.657e-10 427-441
PR00320B 12.19 1.450e-09 520-534
PR00320C 13.01 2.500e-09 303-317
PR00320A 16.74 4.732e-09 520-534
PR00320A 16.74 6.488e-09 344-358
PR00320C 13.01 1.000e-08 344-358



1172 



PD01652 



RECEPTOR CELL NK GLYCOPROTEIN 
IMMUNOGLOB. 



PD01652A 15.35 6.625e-10 24-60 
PD01652B 8.50 1.836e-09 14-66 
PD01652B 8.50 4.021e-09 111-163 



1173 



PD01652 



RECEPTOR CELL NK GLYCOPROTEIN 
IMMUNOGLOB. 



PD01652A 15.35 6.625e-10 24-60 
PD01652B 8.50 1.836e-09 14-66 
PD01652B 8.50 4.021e-09 111-163 



1183 



PD02876 



DECARBOXYLASE 
PHOSPHATIDYLSERINE.



PD02876C 8.80 2.723e-13 316-328
PD02876D 12.13 2.588e-12 427-443 



1184 



BL01289 



TSC-22 / dip / bun family proteins. 



BL01289A 12.18 8.200e-33 124-150 
BL01289B 10.45 8.071e-30 151-180 



1184 



DM00475 



w LOW TRANSPOSASE SAPA 12K. 



DM00475B 12.12 5.891e-10 145-164 



1187 



PR00901 



PHEROMONE B ALPHA-1 RECEPTOR
SIGNATURE 



PR00901H 14.99 4.706e-09 56-66 



1188 



BL00708 



Prolyl endopeptidase family serine proteins. 



BL00708B 24.91 7.197e-12 734-764 



1188 



PF00930 



Dipeptidyl peptidase IV (DPP IV) N-terminal 
region. 



PF00930I 15.96 6.373e-17 776-803
PF00930H 20.16 2.482e-13 697-739
PF00930J 8.78 1.000e-11 828-848
PF00930G 21.30 9.613e-09 657-694 



1189 



BL00708 



Prolyl endopeptidase family serine proteins. 



BL00708B 24.91 7.197e-12 734-764 



1189 



PF00930 



Dipeptidyl peptidase IV (DPP IV) N-terminal 
region. 



PF00930H 20.16 2.482e-13 697-739 
PF00930J 8.78 1.000e-11 790-810
PF00930G 21.30 9.613e-09 657-694 



1190 



BL00708 



Prolyl endopeptidase family serine proteins. 



BL00708B 24.91 7.197e-12 721-751 



1190 



PF00930 



Dipeptidyl peptidase IV (DPP IV) N-terminal
region. 



PF00930I 15.96 6.373e-17 763-790
PF00930H 20.16 2.482e-13 684-726
PF00930J 8.78 1.000e-11 815-835
PF00930G 21.30 9.613e-09 644-681 



1193 



PF00791 



Domain present in ZO-1 and Unc5-like netrin 
receptors. 



PF00791B 28.49 6.612e-15 153-207 
PF00791B 28.49 7.955e-14 186-240 
PF00791B 28.49 3.653e-12 436-490
PF00791B 28.49 9.337e-12 54-108
PF00791B 28.49 4.273e-11 319-373
PF00791B 28.49 7.818e-11 252-306
PF00791B 28.49 1.524e-10 219-273
PF00791B 28.49 2.398e-10 120-174
PF00791C 20.98 3.559e-09 200-238 
PF00791C 20.98 5.235e-09 333-371 
PF00791C 20.98 5.235e-09 544-582 
PF00791B 28.49 6.202e-09 352-406 
PF00791B 28.49 7.028e-09 598-652 
PF00791C 20.98 7.265e-09 101-139 
PF00791B 28.49 8.679e-09 530-584 
PF00791B 28.49 1.000e-08 87-141


1193 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 


PD00078B 13.14 4.600e-12 345-357 
pr>nnn78n 1^149 flfiftp-i 1 4*59-474 

PD00078B 13.14 3.500e-ll 796-808 
PD00078B 13.14 8.500e-ll 863-875 
PD00078B 13.14 4.600e-10 495-507 
PD00078B 13.14 5.950e-10 760-772 
PD00078B 13.14 4.522e-09 212-224 
PD00078B 13.14 6.087e-09 278-290 
PD00078B 13.14 1.000e-08 146-158
PD00078B 13.14 1.000e-08 245-257


1193 


PF00023 


Ank repeat proteins.


PF00023A 16.03 2.500e-12 186-201
PF00023B 14.20 5.154e-11 465-474
PF00023B 14.20 5.154e-11 763-772
PF00023A 16.03 6.571e-11 153-168
PF00023A 16.03 1.750e-10 54-69 
PF00023B 14.20 8.000e-10 866-875 
PF00023B 14.20 1.409e-09 348-357
PF00023B 14.20 2.636e-09 281-290 
PF00023A 16.03 3.250e-09 219-234 
PF00023B 14.20 3.455e-09 498-507
PF00023B 14.20 3.864e-09 799-808 
PF00023A 16.03 4.536e-09 252-267 
PF00023B 14.20 5.500e-09 248-257 
PF00023A 16.03 6.464e-09 598-613 
PF00023B 14.20 7.955e-09 432-441 
PF00023A 16.03 8.071e-09 631-646
PF00023A 16.03 8.071e-09 767-782
PF00023A 16.03 1.000e-08 701-716






HTRA/DEGQ PROTEASE FAMILY

SIGNATURE 


PR00834C 15.43 6.226e-20 253-277
PR00834D 12.14 4.316e-17 291-308 
PR00834B 10.09 7.188e-14 212-232 
PR00834E 13.63 1.000e-12 313-330 
PR00834A 9.80 5.737e-12 191-203 
PR00834F 10.91 1.730e-09 374-386


1195 


PR00555 


ADENOSINE A3 RECEPTOR SIGNATURE 


PR00555E 11.12 5.629e-20 105-122
PR00555F 11.18 6.114e-20 152-169
PR00555D 10.11 4.717e-18 60-76


1195 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE


PR00237G 19.63 8.560e-15 119-145
PR00237F 13.57 3.520e-13 83-107
PR00237E 13.03 4.960e-12 33-56


1195 


PR00424 


ADENOSINE RECEPTOR SIGNATURE 


PR00424D 14.32 9.400e-23 21-40 
PR00424E 15.73 6.211e-14 74-87 
PR00424F 8.50 9.156e-12 119-129 




BL00237


G-protein coupled receptors proteins.


BL00237C 13.19 3.864e-15 78-104
BL00237D 11.23 1.346e-11 129-145


1197


BL00237


G-protein coupled receptors proteins. 


BL00237A 27.68 3.455e-14 95-134 


1197 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE 


PR00237C 15.69 1.257e-10 109-131 
PR00237E 13.03 9.100e-10 204-227 


1197 


PR00245 


OLFACTORY RECEPTOR SIGNATURE 


PR00245A 18.03 9.581e-18 64-85 
PR00245C 7.84 4.780e-13 243-258 
PR00245E 12.40 6.741e-09 296-310 
PR00245B 10.38 8.163e-09 182-196 


1197 


PR00534 


MELANOCORTIN RECEPTOR FAMILY 
SIGNATURE 


PR00534A 11.49 9.229e-09 56-68 


1198 


PR00505 


D12 CLASS N6 ADENINE-SPECIFIC DNA 
METHYLTRANSFERASE SIGNATURE 


PR00505A 14.15 4.857e-13 30-46 
PR00505B 11.49 1.621e-12 51-65 


1199 


PR00179 


LIPOCALIN SIGNATURE 


PR00179B 9.56 2.071e-09 111-123
PR00179C 19.02 9.455e-09 138-153 


1200 


PF00152 


tRNA synthetases class II. 


PF00152D 21.30 8.364e-28 431-469 
PF00152C 28.03 9.250e-21 220-256 
PF00152B 15.67 2.658e-13 159-183
PF00152A 19.68 5.714e-11 44-66


1202 


BL00504 


Fumarate reductase / succinate dehydrogenase 
FAD-binding site proteins. 


BL00504D 10.43 5.390e-17 31-48 


1203 


BL00720 


Guanine-nucleotide dissociation stimulators 
CDC25 family sign


BL00720B 16.57 5.065e-17 309-332 


1204 


PF00013 


KH domain proteins family of RNA binding 
proteins. 


PF00013 5.78 4.150e-09 112-123 


1206 


DM00893 


PYRUVATE DEHYDROGENASE
(LIPOAMIDE) BETA CHAIN. 


DM00893A 19.01 1.000e-40 47-93 
DM00893E 29.52 1.000e-40 234-287 
DM00893C 20.28 2.452e-40 143-184 
DM00893B 27.53 3.483e-31 105-142 
DM00893D 23.36 1.545e-26 197-230 
DM00893F 21.02 6.897e-21 292-316 


1207 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e-36 163-192 
PR00312I 15.78 5.286e-35 326-354
PR00312F 15.06 5.865e-35 193-222 
PR00312H 13.31 8.313e-35 257-284 
PR00312J 13.73 5.688e-34 357-385 
PR00312D 9.43 2.636e-33 122-151 
PR00312C 15.14 8.839e-33 86-115 
PR00312B 15.08 8.941e-33 56-85 
PR00312G 11.11 6.657e-32 224-251 
PR00312A 11.70 6.914e-27 29-52 


1207 


BL00863 


Calsequestrin proteins. 


BL00863G 12.17 1.000e-40 192-233 
BL00863H 14.03 1.000e-40 240-276 
BL00863J 10.84 1.000e-40 304-341 
BL00863A 15.14 7.387e-40 28-64 
BL00863B 12.89 4.300e-32 65-92 
BL00863F 11.27 3.172e-31 161-187 
BL00863I 10.28 6.786e-31 277-303
BL00863E 8.49 1.462e-28 135-160
BL00863C 13.93 7.387e-24 93-114
BL00863D 11.58 5.629e-19 115-132 


1209 


BL00781 


Phosphoenolpyruvate carboxylase proteins 1. 


BL00781C 12.88 7.031e-09 233-287 


1209 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 7.716e-09 515-532


1209 


PR00563 


BETA-3 ADRENERGIC RECEPTOR 
SIGNATURE 


PR00563E 7.48 8.768e-09 782-800 




BL00290


Immunoglobulins and major
histocompatibility complex proteins. 


BL00290A 20.89 1.818e-ll 158-180 


1213 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins.


BL00232B 32.79 2.125e-26 227-274
BL00232B 32.79 8.521e-15 440-487
BL00232B 32.79 1.346e-13 118-165
BL00232B 32.79 5.500e-13 335-382 
BL00232C 10.65 7.923e-10 333-350 
BL00232C 10.65 9.308e-10 438-455 
BL00232C 10.65 9.827e-10 225-242 


1213 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 3.945e-10 438-455 
PR00205B 11.39 2.220e-09 333-350 
PR00205B 11.39 9.542e-09 548-565 


1214


PR00626


CALRETICULIN SIGNATURE 


PR00626D 8.30 8.071e-30 242-264 
PR00626E 11.30 7.632e-24 280-299 
PR00626B 14.12 2.200e-20 126-142 
PR00626E 11.30 3.676e-19 266-285 
PR00626A 14.35 1.500e-18 100-118 
PR00626C 9.70 9.100e-18 215-228 
PR00626C 9.70 7.882e-14 232-245 
PR00626D 8.30 8.017e-13 256-278 
PR00626D 8.30 6.520e-09 208-230 


1214 


BL00803 


Calreticulin family proteins. 


BL00803G 14.33 1.000e-40 258-302 
BL00803F 10.95 2.000e-37 225-255 
BL00803E 16.55 2.588e-31 166-196 
BL00803C 11.13 6.063e-26 91-113 
BL00803F 10.95 7.268e-22 208-238 
BL00803G 14.33 1.127e-19 244-288 
BL00803B 17.08 8.714e-18 63-81 
BL00803D 16.08 1.000e-15 128-138 
BL00803G 14.33 3.962e-15 272-316 
BL00803A 14.83 2.688e-14 35-48 
BL00803F 10.95 2.179e-ll 191-221 
BL00803F 10.95 9.516e-09 242-272 


1215 


PF00711 


Beta defensins. 


PF00711 15.76 7.915e-11 45-77


1215 


PD00866 


GLYCOPROTEIN PROTEIN SPIKE E2 
PRECURSOR PEPLOMER. 


PD00866L 3.73 7.709e-10 59-68 


1215 


PR00858


CRUSTACEAN METALLOTHIONEIN
SIGNATURE


PR00858B 5.93 1.479e-09 40-58


1215 


BL00317 


WAP-type 'four-disulfide core' domain 
proteins. 


BL00317B 14.58 2.216e-09 48-69


1215 


BL00264 


Neurohypophysial hormones proteins. 


BL00264 8.98 5.642e-09 79-105 


1215 


DM01724 


kw ALLERGEN POLLEN CIM1 HOL-LL 


DM01724 8.14 7.968e-12 16-35 
DM01724 8.14 1.409e-11 20-39
DM01724 8.14 1.507e-10 4-23 
DM01724 8.14 6.684e-09 12-31 


1215 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243I 31.77 2.000e-11 42-84
BL00243I 31.77 1.265e-10 54-96
BL00243I 31.77 1.254e-09 45-87
BL00243I 31.77 8.225e-09 58-100


1215 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 2.862e-12 32-77 
BL00203 13.94 3.690e-12 39-84 
BL00203 13.94 4.758e-11 35-80
BL00203 13.94 3.663e-09 42-87 
BL00203 13.94 5.592e-09 50-95 
BL00203 13.94 6.235e-09 36-81 
BL00203 13.94 6.786e-09 40-85 
BL00203 13.94 9.357e-09 60-105 


1218 


PR00946 


MERCURY SCAVENGER PROTEIN 
SIGNATURE 


PR00946A 5.58 6.516e-10 6-24 


1220 


DM01071 


OPACITY PROTEIN. 


DM01071A 1.92 8.990e-09 5-20 


1221 


BL00884 


Osteopontin proteins. 


BL00884C 22.45 1.000e-40 119-160 
BL00884B 12.47 4.673e-33 24-67 
BL00884A 11.35 8.615e-32 1-30 
BL00884D 8.79 4.857e-19 248-264 


1221 


PR00216 


OSTEOPONTIN SIGNATURE 


PR00216A 10.94 5.000e-35 2-31 
PR00216C 9.63 1.391e-32 41-66
PR00216G 12.39 9.550e-31 231-256 
PR00216F 11.79 3.700e-23 152-170 
PR00216E 8.44 3.250e-19 120-134 
PR00216D 2.74 1.200e-18 88-102 
PR00216D 2.74 2.209e-12 82-96 


1222 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.538e-29 225-266 
BL00284A 15.64 3.739e-18 107-130 
BL00284D 16.34 3.793e-17 332-358 
BL00284E 19.15 2.909e-15 419-443 


1223 


PD02327 


GLYCOPROTEIN ANTIGEN PRECURSOR 
IMMUNOGLO. 


PD02327B 19.84 8.941e-23 143-164 
PD02327A 8.89 1.000e-13 115-126
PD02327C 15.47 5.500e-13 209-223 


1225 


PR00418 


DNA TOPOISOMERASE II SIGNATURE 


PR00418F 12.01 3.813e-20 470-486 
PR00418G 14.68 7.000e-19 488-505 
PR00418C 10.02 8.200e-18 100-114 
PR00418I 16.64 4.682e-17 550-566
PR00418A 12.34 3.739e-16 20-35
PR00418B 12.52 6.571e-15 57-70
PR00418E 15.56 7.300e-15 397-411
PR00418D 14.93 7.000e-14 252-265 
PR00418H 13.54 2.385e-12 508-520 


1225 


BL00177 


DNA topoisomerase II proteins. 


BL00177H 21.42 3.647e-39 471-506 
BL00177G 24.83 4.706e-36 417-455 
BL00177B 19.24 1.000e-35 79-114 
BL00177I 21.82 2.200e-21 732-757
BL00177F 12.98 2.500e-18 395-412 
BL00177D 14.66 9.591e-15 252-265 
BL00177E 12.43 7.000e-13 310-321 
BL00177C 13.16 5.950e-12 155-166


1225 


BL01190 


Ribosomal protein L36e proteins. 


BL01190B 16.17 6.929e-10 1140-1194 


1225 


PF00521 


DNA gyrase/topoisomerase IV, subunit A. 


PF00521D 9.77 9.591e-09 788-811 


1226 


BL00455 


Putative AMP-binding domain proteins. 


BL00455 13.31 6.684e-13 248-263 


1226 


PR00154 


AMP-BINDING SIGNATURE


PR00154A 8.88 7.375e-10 241-252 


1228 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 7.698e-13 116-135 
PR00007D 9.64 9.654e-11 193-203
PR00007A 19.33 2.552e-10 89-115 
PR00007C 15.60 3.656e-10 163-184 


1228 


BL01113 


C1q domain proteins.


BL01113B 18.26 1.563e-20 95-130 
BL01113D 7.47 9.308e-12 195-204 
BL01113C 13.18 4.750e-10 163-182 


1230 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR. 


PD00078B 13.14 1.000e-11 378-390
PD00078B 13.14 4.500e-11 495-507
PD00078B 13.14 8.200e-10 897-909 
PD00078B 13.14 4.522e-09 528-540 


1230 


PR00665


OXYTOCIN RECEPTOR SIGNATURE


PR00665E 5.60 5.390e-09 756-769


1230 


PF00791 


Domain present in ZO-1 and Unc5-like netrin
receptors.


PF00791B 28.49 1.890e-13 186-240 
PF00791B 28.49 3.368e-12 469-523 
PF00791B 28.49 2.273e-11 219-273
PF00791B 28.49 2.922e-10 352-406 
PF00791B 28.49 3.534e-10 904-958 
PF00791C 20.98 5.361e-10 366-404 
PF00791B 28.49 8.427e-10 12-66 
PF00791B 28.49 8.951e-10 734-788 
PF00791B 28.49 2.156e-09 153-207 
PF00791B 28.49 7.028e-09 563-617 


1230 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 1.600e-13 219-234 
PF00023A 16.03 2.500e-12 252-267 
PF00023B 14.20 5.154e-11 498-507
PF00023A 16.03 7.750e-10 631-646 
PF00023B 14.20 8.000e-10 900-909 
PF00023A 16.03 1.321e-09 186-201 
PF00023B 14.20 1.409e-09 381-390 
PF00023A 16.03 2.607e-09 698-713 
PF00023B 14.20 4.273e-09 465-474 
PF00023A 16.03 4.536e-09 1007-1022 
PF00023B 14.20 5.500e-09 281-290 
PF00023B 14.20 7.545e-09 531-540 
PF00023A 16.03 1.000e-08 800-815 


1231 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400C 24.53 6.029e-17 210-253 
BL00400D 23.26 2.080e-14 274-310 
BL00400A 21.59 1.600e-10 27-58


1232 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400C 24.53 6.029e-17 210-253 
BL00400D 23.26 2.080e-14 274-310 
BL00400A 21.59 1.600e-10 27-58 


1233 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400C 24.53 6.029e-17 210-253 
BL00400D 23.26 2.080e-14 274-310 
BL00400A 21.59 1.600e-10 27-58


1237 


BL00240 


Receptor tyrosine kinase class III proteins. 


BL00240B 24.70 9.809e-09 132-155 


1247 


BL01248 


Laminin-type EGF-like (LE) domain proteins. 


BL01248 11.02 1.340e-09 289-301 
1247


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764F 16.89 6.610e-09 237-257 


1247 


BL00812 


Glycosyl hydrolases family 8 proteins. 


BL00812B 13.49 6.667e-09 917-931 


1247 


PR00011 


TYPE III EGF-LIKE SIGNATURE


PR00011B 13.08 9.386e-17 767-785
PR00011B 13.08 8.875e-16 289-307
PR00011D 14.03 5.800e-15 550-568
PR00011D 14.03 8.000e-15 767-785
PR00011D 14.03 3.388e-14 289-307
PR00011B 13.08 7.833e-14 160-178
PR00011B 13.08 9.000e-14 550-568
PR00011A 14.06 9.345e-14 289-307
PR00011B 13.08 5.119e-13 203-221
PR00011B 13.08 5.576e-13 421-439
PR00011D 14.03 6.943e-13 421-439
PR00011B 13.08 7.102e-13 638-656
PR00011A 14.06 9.237e-13 203-221
PR00011B 13.08 9.542e-13 378-396
PR00011D 14.03 9.830e-13 638-656
PR00011D 14.03 3.211e-12 378-396
PR00011B 13.08 4.339e-12 810-828
PR00011A 14.06 6.516e-12 378-396
PR00011D 14.03 6.842e-12 810-828
PR00011D 14.03 7.158e-12 160-178
PR00011A 14.06 8.548e-12 421-439
PR00011A 14.06 1.554e-11 550-568
PR00011D 14.03 2.770e-11 593-611
PR00011D 14.03 3.213e-11 507-525
PR00011D 14.03 3.361e-11 203-221
PR00011B 13.08 4.877e-11 246-264
PR00011B 13.08 6.400e-11 332-350
PR00011B 13.08 6.815e-11 593-611
PR00011D 14.03 7.049e-11 332-350
PR00011B 13.08 8.062e-11 724-742
PR00011B 13.08 2.174e-10 507-525
PR00011D 14.03 2.523e-10 464-482
PR00011A 14.06 3.348e-10 767-785
PR00011D 14.03 4.462e-10 724-742
PR00011A 14.06 5.304e-10 810-828
PR00011A 14.06 8.304e-10 638-656
PR00011D 14.03 8.892e-10 246-264
PR00011D 14.03 1.913e-09 681-699
PR00011B 13.08 2.356e-09 464-482
PR00011A 14.06 2.726e-09 160-178
PR00011A 14.06 2.849e-09 246-264
PR00011B 13.08 5.685e-09 681-699
PR00011A 14.06 5.808e-09 681-699
PR00011A 14.06 6.055e-09 724-742
PR00011A 14.06 6.425e-09 464-482
PR00011A 14.06 6.671e-09 507-525


1247 


DM00758 


AGRIN. 


DM00758 13.12 7.485e-09 197-212 
DM00758 13.12 8.412e-09 240-255 


1247 


PR00173 


GLUTAMATE-ASPARTATE 
SYMPORTER SIGNATURE 


PR00173F 10.44 8.820e-09 859-878 
1247


BL00022 


EGF-like domain proteins.


BL00022B 7.54 3.250e-10 210-216 
BL00022A 7.48 9.000e-09 283-289 


1247 


BL00243 


Integrins beta chain cysteine-rich domain
proteins. 


BL00243H 17.53 4.671e-09 284-309 
BL00243H 17.53 7.750e-09 327-352 
BL00243H 17.53 8.816e-09 198-223 
BL00243H 17.53 9.053e-09 241-266 


1254 


BL00247 


HBGF/FGF family proteins. 


BL00247B 31.59 3.077e-35 82-128 
BL00247C 21.54 8.333e-22 137-164 


1254 


PR00262 


IL1/HBGF FAMILY SIGNATURE 


PR00262A 28.26 8.588e-11 77-104


1254 


PR00263 


HEPARIN BINDING GROWTH FACTOR 
FAMILY SIGNATURE 


PR00263D 12.89 5.078e-11 106-125
PR00263C 9.90 7.188e-10 90-102 


1260 


PR00345 


STATHMIN FAMILY SIGNATURE 


PR00345B 7.12 1.371e-11 207-235


1260 


BL00563 


Stathmin family proteins. 


BL00563B 6.08 6.021e-11 213-239


1260 


PF00780 


Domain found in NIK1-like kinases, mouse
citron and yeast ROM. 


PF00780A 10.77 7.857e-10 68-76 


1260 


BL00326 


Tropomyosins proteins. 


BL00326B 7.68 1.235e-09 161-209 


1260 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194C 6.38 9.703e-09 120-148 


1261 


BL00284 


Serpins proteins.


BL00284C 28.56 7.000e-17 212-253
BL00284D 16.34 1.692e-13 324-350
BL00284A 15.64 1.200e-11 49-72


1262 


BL00873 


Sodium:alanine symporter family proteins. 


BL00873B 20.93 9.029e-10 2-53 


1263 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.506e-20 83-133 
BL01020A 11.87 3.821e-19 7-37 
BL01020B 11.70 5.393e-15 41-75 


1263 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE


PR00328B 9.04 2.112e-12 55-79
PR00328A 10.62 4.857e-12 27-50 


1265 


PR00258 


SPERACT RECEPTOR SIGNATURE 


PR00258B 9.63 2.800e-14 493-504 
PR00258C 9.05 1.257e-12 62-72
PR00258C 9.05 7.171e-12 508-518 
PR00258D 14.41 8.500e-12 539-553 
PR00258D 14.41 8.875e-12 93-107 
PR00258A 11.46 3.418e-10 229-245 
PR00258D 14.41 5.034e-10 294-308
PR00258E 13.33 2.500e-09 215-227 
PR00258A 11.46 3.000e-09 133-149 
PR00258C 9.05 7.000e-09 163-173 


1265 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420B 22.67 1.000e-40 478-532 
BL00420B 22.67 7.689e-25 233-287 
BL00420B 22.67 6.625e-18 32-86 
BL00420B 22.67 8.863e-15 133-187 
BL00420B 22.67 5.585e-12 361-415 
BL00420C 11.90 8.625e-09 216-226 
BL00420C 11.90 9.000e-09 563-573 


1266 


PR00258 


SPERACT RECEPTOR SIGNATURE 


PR00258B 9.63 2.800e-14 493-504 
PR00258C 9.05 1.257e-12 62-72 
PR00258C 9.05 7.171e-12 508-518 
PR00258D 14.41 8.500e-12 539-553
PR00258D 14.41 8.875e-12 93-107 
PR00258A 11.46 3.418e-10 229-245 
PR00258D 14.41 5.034e-10 294-308 
PR00258E 13.33 2.500e-09 215-227 
PR00258C 9.05 7.000e-09 163-173




BL00420


Speract receptor repeat proteins domain
proteins.


BL00420B 22.67 1.000e-40 478-532
BL00420B 22.67 7.689e-25 233-287
BL00420B 22.67 6.625e-18 32-86
BL00420B 22.67 8.863e-15 133-187
BL00420B 22.67 5.585e-12 361-415
BL00420C 11.90 8.625e-09 216-226 
BL00420C 11.90 9.000e-09 563-573 


1272 


PR00170 


SODIUM CHANNEL SIGNATURE 


PR00170E 6.48 8.533e-09 34-63 


1273 


BL00107 


Protein kinases ATP-binding region proteins. 


BL00107A 18.39 5.500e-21 214-244 


1273 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.294e-12 214-232 


1273 


BL00239 


Receptor tyrosine kinase class II proteins. 


BL00239B 25.15 2.935e-09 149-196 


1273 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240E 11.56 1.000e-08 200-237 


1275 


BL00427 


Disintegrins proteins. 


BL00427 13.93 7.592e-26 460-514 


1275 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 5.101e-11 359-384


1275 


BL00142 


Neutral zinc metallopeptidases, zinc-binding 
region proteins. 


BL00142 8.38 7.545e-11 359-369


1275 


PR00289 


DISINTEGRIN SIGNATURE 


PR00289A 13.62 2.500e-14 474-493
PR00289B 11.79 4.226e-10 503-515 




PR00480 


ASTACIN FAMILY SIGNATURE 


PR00480B 15.41 8.909e-10 354-372


1275 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907E 11.70 3.647e-09 672-694 


1275 


BL00546 


Matrixins cysteine switch. 


BL00546C 16.41 4.255e-09 353-384 


1275 


BL00024 


Hemopexin domain proteins. 


BL00024D 17.28 5.596e-09 353-384 


1277
PF00023


Ank repeat proteins.


PF00023A 16.03 1.600e-13 345-360
PF00023B 14.20 6.318e-09 302-311
PF00023A 16.03 6.464e-09 306-321


1278 


BL00142 


Neutral zinc metallopeptidases, zinc-binding 
region proteins.


BL00142 8.38 1.857e-09 412-422 


1278


PR00756


MEMBRANE ALANYL DIPEPTIDASE
(M1) FAMILY SIGNATURE


PR00756A 12.90 5.091e-17 245-260
PR00756D 10.58 8.258e-17 412-427 
PR00756B 14.06 7.333e-14 297-312 
PR00756E 11.91 3.769e-09 431-443 


1279 


DM01688 


2 POLY-IG RECEPTOR 


DM01688K 17.19 8.640e-11 78-116
DM01688G 16.45 5.680e-09 76-107 


1288 


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


PR00019A 11.19 8.043e-10 164-177 
PR00019B 11.36 7.120e-09 136-149 


1288 


BL00240 


Receptor tyrosine kinase class III proteins. 


BL00240B 24.70 7.319e-09 319-342 


1290 


PR00019 


LEUCINE-RICH REPEAT SIGNATURE 


PR00019A 11.19 3.400e-12 86-99 
PR00019B 11.36 9.357e-12 83-96 
PR00019A 11.19 4.333e-09 111-124


1295


BL01113


C1q domain proteins.


BL01113C 13.18 9.617e-13 159-178 
BL01113D 7.47 2.174e-11 191-200
BL01113B 18.26 7.658e-11 91-126
BL01113A 17.99 3.106e-10 22-48 


1295 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 9.769e-14 112-131 
PR00007C 15.60 5.688e-13 159-180 
PR00007D 9.64 1.419e-09 189-199 
PR00007A 19.33 4.429e-09 86-112


1295 


PR00513 


5-HYDROXYTRYPTAMINE 1B
RECEPTOR SIGNATURE


PR00513D 11.06 8.085e-09 50-67




1296 


PR00665 


OXYTOCIN RECEPTOR SIGNATURE 


PR00665D 9.93 9.012e-11 108-124


1296 


BL00896 


LacY family proton/sugar symporters 
proteins. 


BL00896A 14.92 2.552e-09 300-332 


1296 


PR00237 


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE


PR00237F 13.57 8.667e-12 269-293
PR00237G 19.63 7.395e-10 314-340
PR00237A 11.48 8.333e-10 34-58 
PR00237B 13.50 4.250e-09 68-89 


1296 


BL00237 


G-protein coupled receptors proteins.


BL00237C 13.19 4.414e-12 264-290
BL00237D 11.23 9.727e-09 324-340 


1297 


BL00019 


Actinin-type actin-binding domain proteins.


BL00019C 14.66 6.250e-28 285-320
BL00019D 15.33 2.309e-15 348-377
BL00019B 13.34 2.976e-13 240-262 
BL00019A 12.56 2.286e-12 215-225 


1297 


PF00435 


Spectrin repeat proteins. 


PF00435A 32.05 2.000e-14 991-1019 
PF00435B 13.41 9.609e-11 1496-1511
PF00435C 20.73 3.571e-09 2006-2025 


1297 


DM00588 


8 kw CH02 ALPHA ANTIGEN 
PARAMYOSIN. 


DM00588B 9.45 6.870e-09 1259-1268 


1297 


BL00326 


Tropomyosins proteins. 


BL00326B 7.68 9.296e-09 2110-2158 


1297 


BL00226 


Intermediate filaments proteins.


BL00226B 23.86 5.605e-09 1734-1781
BL00226B 23.86 9.895e-09 2042-2089 


1298 


BL00019 


Actinin-type actin-binding domain proteins.


BL00019C 14.66 6.250e-28 297-332 
BL00019D 15.33 2.309e-15 360-389 
BL00019B 13.34 2.976e-13 240-262 
BL00019A 12.56 2.286e-12 215-225 


1298 


PF00435 


Spectrin repeat proteins. 


PF00435A 32.05 2.000e-14 1003-1031
PF00435B 13.41 9.609e-11 1508-1523
PF00435C 20.73 3.571e-09 2018-2037 


1298 


DM00588 


8 kw CH02 ALPHA ANTIGEN 
PARAMYOSIN. 


DM00588B 9.45 6.870e-09 1271-1280 


1298 


BL00326 


Tropomyosins proteins. 


BL00326B 7.68 9.296e-09 2122-2170 


1298 


BL00226 


Intermediate filaments proteins. 


BL00226B 23.86 5.605e-09 1746-1793 
BL00226B 23.86 9.895e-09 2054-2101 


1304 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700C 13.17 8.535e-09 125-142 


1305 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700C 13.17 8.535e-09 240-257 


1306 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.433e-10 207-260 


1306 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 9.211e-10 428-446 
PR00020C 13.66 3.340e-09 509-520 


1306 


BL00740 


MAM domain proteins. 


BL00740B 19.76 4.682e-10 578-598 
BL00740A 13.87 5.588e-09 430-442 


1308 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072E 24.12 5.014e-12 724-766 
BL00072D 30.08 7.136e-10 635-685 


1309 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072E 24.12 5.014e-12 706-748 
BL00072D 30.08 7.136e-10 617-667 


1311 


PR00215 


NEUROMODULIN SIGNATURE


PR00215C 13.98 6.779e-10 743-763 


1311 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412B 10.60 1.681e-09 735-771 


1311 


PF00992 


Troponin. 


PF00992A 16.67 9.746e-10 609-643 
PF00992A 16.67 5.145e-09 613-647
PF00992A 16.67 7.395e-09 615-649 
PF00992A 16.67 1.000e-08 608-642 


1314 


PF00632 


HECT-domain (ubiquitin-transferase).


PF00632C 20.66 1.000e-29 2270-2301
PF00632B 18.45 2.800e-21 2215-2242


1314 


PF00624 


Flocculin repeat proteins. 


PF00624J 6.21 7.000e-09 1424-1478 


1314 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 9.022e-10 350-400 
BL00412D 16.54 1.551e-09 342-392 
BL00412D 16.54 7.429e-09 349-399 
BL00412D 16.54 8.531e-09 328-378 


1314 


DM00191 


w SPAC8A4.04C RESISTANCE 
SPAC8A4.05C DAUNORUBICIN. 


DM00191D 13.94 6.635e-09 1410-1448 
DM00191D 13.94 9.374e-09 1404-1442 


1317 


DM00179 


w KINASE ALPHA ADHESION T-CELL. 


DM00179 13.97 5.263e-10 107-116 


1321


PR00019


LEUCINE-RICH REPEAT SIGNATURE


PR00019B 11.36 4.000e-11 335-348
PR00019B 11.36 1.450e-10 193-206
PR00019B 11.36 3.250e-10 167-180
PR00019A 11.19 4.130e-10 338-351
PR00019A 11.19 4.522e-10 480-493
PR00019B 11.36 7.300e-10 309-322
PR00019B 11.36 1.720e-09 569-582
PR00019B 11.36 3.880e-09 477-490
PR00019A 11.19 5.667e-09 170-183


1321 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 6.280e-09 568-587 
DM01551C 14.62 8.320e-09 355-374 


1322 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290B 13.17 9.250e-09 317-334 


1324 


PD01719 


PRECURSOR GLYCOPROTEIN SIGNAL 
RE. 


PD01719A 12.89 1.740e-11 36-63






proteins. 


BL00420B 22.67 4.696e-38 15-69
BL00420B 22.67 6.949e-36 189-243 
BL00420B 22.67 1.300e-35 301-355 
BL00420B 22.67 4.358e-30 639-693 
BL00420B 22.67 1.863e-26 406-460 
BL00420C 11.90 1.360e-13 100-110 
BL00420C 11.90 6.797e-11 274-284
BL00420C 11.90 8.322e-11 492-502
BL00420C 11.90 1.545e-10 386-396 


1328 


PR00258 


SPERACT RECEPTOR SIGNATURE 


PR00258B 9.63 7.188e-15 654-665 
PR00258B 9.63 8.875e-15 30-41 
PR00258B 9.63 8.875e-15 204-215 
PR00258B 9.63 6.400e-14 316-327 
PR00258B 9.63 3.543e-13 421-432 
PR00258E 13.33 7.811e-13 99-111 
PR00258D 14.41 7.500e-11 468-482
PR00258E 13.33 9.625e-11 273-285
PR00258D 14.41 2.552e-10 700-714 
PR00258E 13.33 3.000e-10 491-503 
PR00258A 11.46 8.791e-10 635-651 
PR00258C 9.05 1.000e-09 45-55 
PR00258A 11.46 2.375e-09 185-201
PR00258A 11.46 6.500e-09 11-27 
PR00258A 11.46 6.500e-09 297-313 



WO 2004/080148 



PCT/US2003/030720 



356 

TABLE 3A 



SEQ 
ID 


Database 
entry ID


Description 


Result* 








PR00258E 13.33 7.450e-09 385-397 
PR00258C 9.05 8.500e-09 436-446 
PR00258A 11.46 9.625e-09 402-418


1329 


PD01270 


RECEPTOR FC IMMUNOGLOBULIN 
AFFIN. 


PD01270A 17.22 7.500e-15 21-60 
PD01270B 22.18 6.288e-13 72-108 
PD01270C 19.54 7.608e-09 114-142 


1333 


BL00246 


Wnt-1 family proteins. 


BL00246D 23.97 1.000e-40 202-254 
BL00246E 20.32 8.636e-35 319-364 
BL00246B 13.69 6.806e-29 101-135 
BL00246C 15.56 9.036e-22 167-191 
BL00246A 15.75 6.870e-21 68-87 


1335 


PR00245 


OLFACTORY RECEPTOR SIGNATURE 


PR00245A 18.03 7.300e-19 26-47 


1337 


BL00476 


Fatty acid desaturases family 1 proteins. 


BL00476C 13.87 1.000e-40 80-132 
BL00476E 12.10 1.000e-40 231-283 
BL00476D 11.28 2.125e-30 171-221 
BL00476B 18.34 4.494e-16 36-79 
BL00476F 12.75 6.333e-16 285-329 


1337 


PR00075 


FATTY ACID DESATURASE FAMILY 1 
SIGNATURE 


PR00075D 11.41 3.538e-33 131-160 
PR00075C 10.31 3.813e-20 94-114 
PR00075G 8.85 2.047e-19 268-282 
PR00075E 12.60 7.585e-16 192-210
PR00075F 16.07 6.952e-15 225-246
PR00075A 16.97 4.429e-14 47-67
PR00075B 12.16 7.047e-11 71-93


1339 


PD00301 


PROTEIN REPEAT MUSCLE CALCIUM- 
BL 


PD00301A 10.24 6.400e-09 55-65 


1339 


BL00422 


Granins proteins. 


BL00422C 16.18 6.647e-09 44-71 
BL00422C 16.18 8.235e-09 45-72 


1339 


BL00319 


Amyloidogenic glycoprotein extracellular 
domain proteins. 


BL00319C 17.12 5.836e-11 48-81
BL00319C 17.12 5.974e-09 47-80 
BL00319C 17.12 8.342e-09 44-77 
BL00319C 17.12 9.053e-09 45-78 


1340 


BL00406 


Actins proteins. 


BL00406C 6.75 4.286e-20 137-191 
BL00406B 5.47 8.130e-14 78-132 
BL00406D 12.58 3.734e-13 267-321 
BL00406A 9.95 1.290e-12 5-39 


1340 


PR00190 


ACTIN SIGNATURE 


PR00190F 7.80 4.803e-12 135-154 
PR00190C 11.49 1.878e-09 57-79 


1341 


BL00048 


Protamine P1 proteins.


BL00048 6.39 3.588e-09 4-30 


1343 


BL00790 


Receptor tyrosine kinase class V proteins. 


BL00790I 20.01 9.520e-11 555-585


1343 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014C 15.44 2.565e-09 544-562 


1344 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 5.776e-12 759-777 
PR00020C 13.66 6.932e-10 832-843


1344 


PD01270 


RECEPTOR FC IMMUNOGLOBULIN 
AFFIN. 


PD01270D 24.66 5.378e-09 292-327 


1344 


BL00740 


MAM domain proteins. 


BL00740A 13.87 8.313e-12 761-773 
BL00740B 19.76 8.500e-09 901-921 


1344 


PD02080 


T-CELL GLYCOPROTEIN CD8 CHAIN 
SURFACE ALPHA PRE. 


PD02080B 20.69 9.621e-09 538-576 


1344 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 9.809e-09 155-178 
1345


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 6.577e-10 127-149 


1345 


BL00222 


Insulin-like growth factor binding proteins.


BL00222B 11.09 6.940e-10 74-89 


1345 


BL00621 


Tissue factor proteins. 


BL00621A 8.69 6.473e-09 5-22 


1346


PR00326


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE


PR00326A 8.75 1.386e-09 85-105 


1346


PF00922


Vesiculovirus phosphoprotein


PF00922A 19.17 1.724e-09 437-470 


1346 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE


PR00449A 13.20 1.931e-09 83-104 


1346


PR00905


HYPOTHETICAL MYCOPLASMA
LIPOPROTEIN (MG045) SIGNATURE


PR00905H 6.88 5.886e-09 343-363


1348 


PR00406 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


PR00406F 3.97 3.520e-10 158-166 


1348 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE


PR00014B 14.77 2.500e-09 848-858 


1348 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 3.202e-09 480-512 


1348 


DM00179 


w KINASE ALPHA ADHESION T-CELL. 


DM00179 13.97 7.261e-09 205-214 


1348 


BL00240 


Receptor tyrosine kinase class III proteins.


BL00240B 24.70 8.277e-09 263-286 


1 Q/tQ 


PD02520


K I. A v I A 1 VJts. t i\ 1 A_- 1 J Iv o V 7 1\ 

TRANSMEMBRANE. 


PD02520C 10.48 9.203e-09 881-897


i 




C.ELEGANS SRG FAMILY INTEGRAL
MEMBRANE PROTEIN SIGNATURE


PR00698E 14.43 8.714e-09 97-122


1:531/ 


BL00284


Serpins proteins.


BL00284C 28.56 5.714e-32 203-244
BL00284D 16.34 9.640e-19 311-337
BL00284A 15.64 1.783e-18 72-95
BL00284B 17.99 3.045e-16 176-196
BL00284E 19.15 6.250e-14 378-402 






Ank repeat proteins.


PF00023A 16.03 7.000e-11 69-84
PF00023B 14.20 2.636e-09 131-140 


1 J J J 




REPEAT PROTEIN ANK NUCLEAR
ANKYR. 


PD00078B 13.14 2.957e-09 128-140 


1355 


PF00791 


Domain present in ZO-1 and Unc5-like netrin 
receptors. 


PF00791B 28.49 9.587e-09 69-123 


1356 


BL00107 


Protein kinases ATP-binding region proteins. 


BL00107A 18.39 4.000e-10 339-369 


1356 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109D 17.04 4.234e-09 403-425 
PR00109B 12.27 1.000e-08 339-357 


I J JO 


PR00237


RHODOPSIN-LIKE GPCR SUPERFAMILY
SIGNATURE


PR00237G 19.63 3.793e-13 41-67


lJJO 


BL00237


G-protein coupled receptors proteins.


BL00237D 11.23 3.348e-12 51-67 


1359 


BL00178 


Aminoacyl-transfer RNA synthetases class-I 
proteins.


BL00178B 7.11 3.700e-12 344-354


1360 


PF00969 


Class II histocompatibility antigen, beta 
domain proteins. 


PF00969A 22.07 5.846e-29 12-54 
PF00969B 9.97 6.211e-25 56-91 
PF00969C 27.72 7.324e-16 95-144


1361 


BL00520 


Interleukin-10 family proteins. 


BL00520A 6.21 6.471e-09 1-13


1362 


BL00520 


Interleukin-10 family proteins. 


BL00520A 6.21 6.471e-09 1-13


1365 


BL00253 


Interleukin-1 proteins.


BL00253D 25.67 3.464e-11 95-134


1365 


PR00264 


INTERLEUKIN-1 SIGNATURE


PR00264C 17.77 3.294e-17 95-123 
PR00264B 20.98 6.250e-09 56-82 


1366 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 4.541e-13 791-817 
1366


BL00477 


Alpha-2-macroglobulin family thiolester 
region proteins. 


BL00477J 19.04 7.207e-29 1221-1251 
BL00477F 17.34 8.500e-25 786-815 
BL00477G 19.43 8.826e-23 963-994 
BL00477A I*? 50 9 8G0e-23 122-150 
BL00477L 23.51 8.800e-22 1417-1449 
BL00477K 17.42 4.529e-14 1362-1385 
BL00477E 17.53 6.538e-13 756-776 
BL00477B 9.05 6.625e-13 209-221 
BL00477I 18.76 2.650e-12 1065-1091
BL00477D 12.73 4.073e-12 730-739 
BL00477H 9.07 5.395e-12 1034-1045 
BL00477C 15.70 1.161e-10 236-252 


1366 


BL00115 


Eukaryotic RNA polymerase II heptapeptide 
repeat proteins. 


BL00115V 21.32 5.745e-09 1402-1451 


1366 


BL00713 


Sodium:dicarboxylate symporter family 
proteins. 


BL00713F 16.13 8.989e-09 917-958 


1368 


BL00983 


Ly-6 / u-PAR domain proteins. 


BL00983C 12.69 8.714e-16 90-105 
BL00983B 8.19 2.161e-10 23-32 


1368 


BL00272 


Snake toxins proteins. 


BL00272C 8.27 9.791e-09 94-105 



* Results include in order: accession number subtype; raw score; p-value; position of signature in amino 
acid sequence 
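
The four-field layout described in the footnote can be read mechanically. The following is a minimal sketch in Python, assuming whitespace-separated fields in the stated order; the names `SignatureHit` and `parse_hit` are hypothetical and are not part of the original disclosure. Lines damaged in reproduction do not match the pattern and are rejected rather than guessed at.

```python
import re
from typing import NamedTuple


class SignatureHit(NamedTuple):
    """One result line: accession/subtype, raw score, p-value, position."""
    accession: str
    score: float
    p_value: float
    start: int
    end: int


# Assumed layout: "<accession> <score> <p-value> <start>-<end>"
_HIT = re.compile(r"^(\S+)\s+(\d+\.\d+)\s+(\d\.\d+e-\d+)\s+(\d+)-(\d+)$")


def parse_hit(line: str) -> SignatureHit:
    """Parse one result line; raise ValueError if it does not fit the layout."""
    match = _HIT.match(line.strip())
    if match is None:
        raise ValueError(f"not a well-formed result line: {line!r}")
    accession, score, p_value, start, end = match.groups()
    return SignatureHit(accession, float(score), float(p_value),
                        int(start), int(end))


if __name__ == "__main__":
    hit = parse_hit("BL00272C 8.27 9.791e-09 94-105")
    print(hit.accession, hit.score, hit.p_value, hit.start, hit.end)
```

A strict pattern is used deliberately, so that a line with a split or fused field is flagged for manual review instead of being silently misread.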
685


IPB001400 


Somatotropin hormone family 


IPB001400A 14.85 1.90e-13 35-58 


686 


IPB001400 


Somatotropin hormone family 


IPB001400B 23.62 9.25e-24 79-115 
IPB001400A 14.85 4.33e-14 35-58 


686 


PR00836 


Somatotropin hormone family 
signature I 


PR00836A 15.53 1.96e-11 79-92
PR00836B 17.50 9.31e-11 101-119
IPB001400C 13.76 6.28e-10 135-151 


688 


IPB001400 


Somatotropin hormone family 


IPB001400B 23.62 1.90e-28 79-115 
IPB001400A 14.85 4.91e-16 35-58 


688 


PR00836


Somatotropin hormone family 
signature II 


PR00836B 17.50 1.43e-15 101-119 
PR00836A 15.53 2.35e-13 79-92 
IPB001400C 13.76 4.72e-10 135-151


689 


IPB000215 


Serpins 


IPB000215E 15.36 5.70e-17 373-397
IPB000215A 13.01 3.42e-15 77-100 
IPB000215D 15.35 8.05e-ll 294-320 
IPB000215B 9.87 6.04e-10 162-174 
IPB000215C 13.90 7.97e-10 189-203 


690 


PR00390 


Phospholipase C signature I 


PR00390A 14.24 6.34e-20 191-209


690 


IPB002048 


EF-hand family 


IPB002048 7.91 3.84e-09 43-55 


691 


IPB000734 


Lipase 


IPB000734 10.25 8.50e-09 435-449 


693 


PR00573 


Interleukin 8B receptor signature III 


PR00573C 9.83 2.15e-09 38-46 


693 


PR00427 


Interleukin-8 receptor signature I 


PR00427A 15.48 4.46e-09 34-48 


694 


IPB000407 


GDA1/CD39 family of nucleoside 
phosphatase


IPB000407C 15.11 4.09e-19 217-239
IPB000407D 11.44 4.27e-15 248-261
IPB000407A 11.93 1.62e-11 101-112
IPB000407B 8.75 2.70e-11 175-186
IPB000407G 17.95 2.80e-11 460-474
IPB000407F 16.53 8.54e-10 430-444


695 


PR00237 


Rhodopsin-like GPCR superfamily 
signature VI 


PR00237F 14.34 3.20e-09 239-263 


695 


PR01066


P2Y4 purinoceptor signature II 


PR01066B 4.51 6.03e-09 111-126 


696 


IPB001304 


C-type lectin domain 


IPB001304A 17.98 3.00e-17 168-192 


696 


PR01408


Macrophage scavenger receptor 
signature VI 


PR01408F 9.76 4.87e-09 83-107 


698 


IPB000407 


GDA1/CD39 family of nucleoside
phosphatase 


IPB000407C 15.11 3.30e-16 165-187 
irr>uuu4U/ij 1 1.44 y. j ye- 1 j iyo-zuy 

rpRHHA/lATR 8 7^ Q £R#» 19 19^ X'XA 
lrDUUU4U/D O. /O y.Ooe-lZ IZj-1j4 

IPB000407A 11.93 4.50e-10 48-59 

TPRnfWWl7P 16 ^ 7 S7p 1ft V77-1Q1 


700


IPB000433


ZZ zinc finger


TPRftftflAI'* 14 104 ^Oa-1 1 1 Rd-9ftn 


700


PR00608


Class II cytochrome C signature I 


PP.nn^flRA 19 7S R ft7#»-1fl 1 18-141 


700 


IPB000102


Neuraxin / MAP1B repeat


TPRnftft109A 1ft ^ft ^ ^Q<» ftQ 116-144 


700 


IPB002989 


Mycobacterial pentapeptide repeats 


IPB002989B 10.80 5.76e-09 110-135 


700 


PR01286


Orphan nuclear receptor NOR1
signature V


PR01286E 5.27 7.14e-09 133-154 


700 


PR00456 


Ribosomal protein P2 signature V 


PR00456E 3.08 8.64e-09 123-137 


700 


IPB001119


S-layer protein (SLH domain) 


PR00456E 3.08 9.69e-09 122-136 


700 


IPB001005 


Myb DNA binding domain 


IPB001005A 11.39 9.71e-09 231-251


701 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 1.00e-09 280-294 


701 


PR01217 


Proline rich extensin signature VIII 


PR01217H 5.61 1.67e-09 309-321 


702 


IPB000345 


Cytochrome c family heme-binding 
site 


IPB000345 9.03 7.19e-09 107-119 


703 


IPB001251


Cellular retinaldehyde-binding 
protein (CRAL)/Triple function 
domain (TRIO) 


IPB001251A 7.40 5.05e-12 38-49 
IPB001251B 14.78 7.14e-12 195-209 
703


PR00180


Cellular retinaldehyde-binding
protein signature I


PR00180A 11.19 6.94e-11 37-59
PR00180D 13.13 1.92e-09 202-221 


704 


IPB002610 


Rhomboid family 


IPB002610C 5.81 3.81e-10 284-294 
IPB002610B 5.33 6.81e-09 225-235 


705 


PR01256 


Otxl transcription factor signature II 


PR01256B 5.92 5.97e-11 221-233
PR01256B 5.92 2.35e-10 219-231
PR01256B 5.92 2.11e-09 220-232
PR01256B 5.92 2.31e-09 222-234
PR01256B 5.92 2.62e-09 217-229 


705 


IPB001541 


SUR2-type hydroxylase/desaturase 
catalytic domain 


IPB001541B 11.65 3.14e-09 223-232
IPB001541B 11.65 3.14e-09 224-233
IPB001541B 11.65 3.14e-09 225-234
IPB001541B 11.65 6.57e-09 222-231 


705 


PR00910 


Luteovirus ORF6 protein signature I 


PR00910A 2.74 9.04e-09 756-768 


706 


IPB001124 


Lipid-binding serum glycoprotein 


IPB001124D 21.85 2.50e-12 251-287
IPB001124C 25.71 5.08e-11 184-227


707 


IPB002495 


Glycosyltransferase family 8 


IPB002495B 11.16 4.77e-09 273-283


708 


IPB001781 


LIM domain 


lrlSUU 1 /o 1 l l.**Z o. / /e- 11 j L-*f 1 . 


710 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442F 15.05 1.00e-40 1624-1667
IPB001442C 14.98 4.82e-40 1537-1571
IPB001442A 26.12 4.09e-39 1298-1350
IPB001442A 26.12 5.40e-35 114-166 
IPB001442D 15.34 1.00e-34 1572-1603 
IPB001442A 26.12 7.11e-29 799-851 

IPB001442A 26.12 1.47e-28 781-833

IPB001442A 26.12 3.48e-28 790-842 
IPB001442A 26.12 4.57e-28 814-866 


710 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 1.93e-27 1339-1392 
IPB000885B 19.15 2.24e-27 783-836 
IPB001442A 26.12 2.53e-27 683-735 
IPB001442A 26.12 3.59e-27 796-848
IPB000885B 19.15 4.06e-27 780-833
IPB001442A 26.12 4.81e-27 925-977
IPB001442A 26.12 5. 


710 


IPB001073 


Complement Clq protein 


IPB001073A 22.14 9.18e-19 1413-1447
IPB000885A 11.46 9.29e-19 744-781
IPB000885B 19.15 9.40e-19 1348-1401 
IPB000885B 19.15 9.40e-19 1412-1465 
rpPfifll /LA 0 ) a ")ft 10 9 49p-19 5^8-590 

IPB001442A 26.12 9.42e-19 1304-1356 
IPB000885B 19 


710 


IPB000817 


Prion protein 


IPB000817A 8.34 7.23e-10 777-819 
IPB000885A 11.46 7.26e-10 1064-1101 
IPB001442B 12.38 7.30e-10 735-755 
IPB001442B 12.38 7.30e-10 938-958 

IPB001442A 26.12 7.36e-10 582-634 
IPB001073A 22.14 7.4 


710 


IPB001285 


Synaptophysin/synaptoporin 


IPB001285F 6.39 4.08e-09 1379-1423 
IPB000885B 19.15 4.11e-09 462-515 
IPB000885B 19.15 4.11e-09 1087-1140 
IPB001442B 12.38 4.28e-09 103-123 
IPB000885A 11.46 4.31e-09 612-649 
IPB000885B 19.15 4.35e-09 1213-1266 
IPB001442B 12.38 


710 


IPB003778 


DUF183 


IPB003778B 27.11 7.31e-09 302-344 
IPB001442B 12.38 7.32e-09 794-814 
IPB001442A 26.12 7.34e-09 629-681

TT»T> AAAOOCY5 1 A 1 C *7 OO — rtA f AO 1 

IPB000885B 19.15 7.38e-09 598-651 
IPB001442A 26.12 7.42e-09 444-496 
IPB001073A 22.14 7.47e-09 975-1009 
IPB000885B 19.15 7.5 


710 


IPB003531 


Short hematopoietin receptor family


IPB003531C 15.87 9.76e-09 518-535 

IPB000817A 8.34 9.81e-09 309-351
IPB000885B 19.15 9.84e-09 1451-1504 
IPB001442B 12.38 9.88e-09 302-322 
IPB000817A 8.34 9.91e-09 1026-1068
IPB000885B 19.15 1.00e-08 658-711 


711 


PR00261 


Low density lipoprotein (LDL) 
receptor signature II 


PR00261B 15.12 4.13e-22 1101-1122 
PR00261C 18.72 2.87e-21 1015-1036 

PR00261B 15.12 4.46e-21 1015-1036 
PR00261E 18.62 5.74e-21 1144-1165
PR00261B 15.12 1.32e-20 3523-3544 


711 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat" 


IPB000033D 30.18 2.03e-20 2057-2095 
PR00261B 15.12 2.61e-20 892-913 
PR00261A 15.49 2.73e-20 1053-1074 
PR00261D 16.87 6.40e-20 892-913 
PR00261B 15.12 6.46e-20 1053-1074
PR00261F 15.46 7.92e-20 892-913 
PR00261D 16.87 8.56e-20 3 


711 


IPB002172 


Low density lipoprotein (LDL)- 
receptor class A (LDLRA) domain 


IPB002172 7.37 1.00e-16 2818-2830 
PR00261F 15.46 2.10e-16 1185-1206
PR00261D 16.87 2.15e-16 3721-3742
PR00261A 15.49 2.38e-16 2729-2750
PR00261D 16.87 2.38e-16 933-954
PR00261E 18.62 2.97e-16 2729-2750
PR00261F 15.46 3.41e-16 


711 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 6.14e-16 206-221 
PR00261C 18.72 7.57e-16 2729-2750 
PR00261A 15.49 7.92e-16 3562-3583 
PR00261F 15.46 8.02e-16 2729-2750
PR00261C 18.72 8.30e-16 3600-3621
IPB000033A 21.82 8.33e-16 2731-2753
PR00261B 15.12 8.53e-16 933-954
PR00261F 15.46 8.68e-16 3562-3583 
PR00261C 18.72 9.27e-16 80-101 
PR00261F 15.46 9.56e-16 3404-3425 
PR00261E 18.62 9.72e-16 3562-3583 
PR00261C 18.72 9.76e-16 2938-2959 
PR00261E 18.62 1.53e-15 3404-3425 
PR00261E 18.62 1.53e-15 3484-3505 
PR00261D 16.87 1.63e-15 3641-3662 
PR00261C 18.72 1.68e-15 3484-3505 
PR00261E 18.62 1.79e-15 3809-3830
PR00261D 16.87 1.84e-15 3364-3385 

PR00261A 15.49 2.29e-15 2687-2708

IPB002172 7.37 2.64e-15 89-101 
PR00261F 15.46 2.80e-15 2687-2708 
PR00261E 18.62 3.12e-15 3523-3544
PR00261C 18.72 3.25e-15 3641-3662 


711 


PR00764 


Complement C9 signature II 


PR00764B 12.47 3.36e-15 1048-1068 
IPB002172 7.37 3.45e-15 1110-1122 
PR00261B 15.12 3.74e-15 3600-3621 
PR00261B 15.12 4.33e-15 2893-2914 
PR00261C 18.72 4.60e-15 2687-2708 
IPB000033C 11.58 4.81e-15 3128-3142
IPB000033D 30.18 5. 


711 


IPB001774 


Delta serrate ligand 


IPB001774D 19.23 9.89e-14 4240-4286 
IPB002172 7.37 1.00e-13 2902-2914 
PR00261E 18.62 1.00e-13 2558-2579 
PR00261D 16.87 1.53e-13 1185-1206
PR00261C 18.72 1.96e-13 125-146
PR00261F 15.46 2.19e-13 2558-2579
IPB000033C 11.58 2.29e-13 1376-1390
PR00261B 15.12 2.53e-13 2558-2579
IPB002172 7.37 2.59e-13 1062-1074 
IPB002172 7.37 2.59e-13 2947-2959 
IrfcJUUZl fz 1.51 J.lze-li 2ool-2o/3 
PR00764B 12.47 3.38e-13 3636-3656 
IPB002172 7.37 3.65e-13 3650-3662 


711 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 4.00e-13 206-217 
PR00261A 15.49 4.60e-13 1185-1206 
IPB000152 8.86 5.09e-13 3019-3034
PR00261B 15.12 5.25e-13 3444-3465 

PR00261E 18.62 5.61e-13 125-146

IPB002172 7.37 5.76e-13 2776-2788 
IPB002172 7.37 6.29e-13 


711 


PR00010 


Type II EGF-like signature III 


PR00010C 6.98 8.13e-11 211-221
IPB002172 7.37 8.43e-11 3532-3544
IPB000033A 21.82 8.71e-11 1187-1209
IPB000033C 11.58 9.00e-11 1774-1788
IPB000152 8.86 9.04e-11 2979-2994
PR00261C 18.72 9.18e-11 2558-2579
IPB000033C 11.58 5.86e-10 2081-2095 


711 


PR00907 


Thrombomodulin signature II 


PR00907B 11.50 6.04e-10 4218-4234
IPB000033C 11.58 6.40e-10 411-425
IPB002172 7.37 6.54e-10 942-954 
IPB000033C 11.58 6.58e-10 1466-1480 
IPB000033A 21.82 7.26e-10 2560-2582 
PR00261C 18.72 7.67e-10 2893-2914
PR00010C 6.98 8.55e-10 3024-3034 
PR00764B 12.47 8.62e-10 120-140 

rKvU/04r5 o. /je-lU ZoU4-ZoZ4 

PR00764B 12.47 8.85e-10 3439-3459 

IroUUZl/Z /.J/ y.o le-lU J4i/jO0Uj 

IPB000033C 11.58 9.46e-10 3084-3098 
IPB002172 7.37 1.00e-09 2647-2659 
PR00764B 12.47 1.22e-09 3479-3499 
IPB000033C 11.58 1.48e-09 736-750 
PR00764B 12.47 1.65e-09 2594-2614 

PR00764B 12.47 2.63e-09 75-95 


711




"Developmental signaling protein, 
Wnt-1 family"


ToiinnriO'7nT7 11 ai a io^ no aha\ ziooo 
lrr>UUUy/Ur Zo.4j 4. iye-vjy 4Z4l-4Zoy 

IPB000033C 11.58 4.21e-09 2404-2418


711 


PR00873 


Echinoidea (sea urchin) 
metallothionein signature IV 


PR00873D 8.25 4.88e-09 4326-4344 
PR00764B 12.47 5.23e-09 2933-2953 
IPB000033D 30.18 5.37e-09 4044-4082 
PR00764B 12.47 5.66e-09 2553-2573 
PR00764B 12.47 5.99e-09 3518-3538 
IPB001881B 12.28 6.87e-09 2979-2990 


711 


IPB001169 


"Integrin beta, C-terminus" 


IPB001169K 27.45 6.96e-09 2547-2589 
IPB000033C 11.58 7.91e-09 1647-1661 


711 


IPB002557 


Chitin binding domain 


IPB002557B 12.64 7.92e-09 1236-1249 
IPB000033C 11.58 8.07e-09 367-381
Tpnnfififtiir* 1 1 & 9i*» no 1 190. 1 141 

IPB001774C 18.25 8.26e-09 4301-4343
PR00010C 6.98 8.46e-09 3900-3910 


711 


IPB003886 


Extracellular domain in nidogen 


IPB003886D 13.91 8.62e-09 206-225 
PR00764B 12.47 8.70e-09 1096-1116

IPB003886E 12.94 8.88e-09 4100-4110 
PR00764B 12.47 9.46e-09 2847-2867 


711 


IPB000118 


Granulin 


IPB000118C 7.41 9.65e-09 3822-3863 
PR00907B 11.50 9.66e-09 162-178 


712 


PR00261 


Low density lipoprotein (LDL) 
receptor signature II 


r»T>/"lA1/Tl T> 1C 1*> "7 >0« 1 O OA 1 A1 

PRUOzolb ij.lz /.4Je-lo olMUl 
PR00261D 16.87 7.25e-17 80-101 
PR00261E 18.62 3.53e-16 80-101 
PR00261F 15.46 5.39e-16 80-101 
PR00261A 15.49 6.08e-16 80-101 


712 


IPB002172 


Low density lipoprotein (LDL)- 
receptor class A (LDLRA) domain


IPB002172 7.37 2.64e-15 89-101

BD A AO< 1p lO *70 1 Ala. 1 < Bfl 1 fi 1 

rKUUZOll^ lo. /Z J.4/e-lj oU-lUl i 

PR00261A 15.49 7.64e-15 125-146 

ppaa9£1 r? i je; /i£ q qa 0 i c i 9< i a/; 

PR00261D 16.87 1.98e-14 125-146


712 


IPB000033


"Low-density lipoprotein (Idl) 
receptor, YWTD repeat" 


TPRnnnn^^A 91 R9 1 ^i*»..izi s?-if)4 
irrjuuuuj j/v zi.oz j.jje-i*t oz-iuh 

PR00261C 18.72 1.96e-13 125-146

PR00261E 18.62 5.61e-13 125-146 

IPB002172 7.37 6.40e-12 134-146 

PR00261B 15.12 9.37e-12 125-146


712 


PR00764 


Complement C9 signature II 


PR00764B 12.47 8.62e-10 120-140 


712 


PR00907


Thrombomodulin signature II 


PR00907B 11.50 9.66e-09 162-178

PR00764B 12.47 1.00e-08 75-95 


713 


IPB003164 


Alpha adaptin carboxyl-terminal 
domain 


IPB003164M 10.25 8.22e-09 164-195 


714 


PR00205 


Cadherin signature VI 


PR00205F 19.57 3.86e-16 741-767
PR00205F 19.57 2.13e-15 301-327
PR00205B 20.09 7.30e-15 996-1025 

PR00205B 20.09 9.70e-15 250-279
PR00205B 20.09 1.84e-14 475-504
PR00205D 12.22 4.12e-14 332-351 


714 


IPB002126 


Cadherin domain 


IPB002126B 12.04 4.79e-14 238-255 
PR00205B 20.09 4.94e-14 1210-1239 
PR00205B 20.09 7.19e-14 1315-1344 

PR00205D 12.22 9.31e-14 1294-1313
IPB002126B 12.04 3.57e-13 463-480 
PR00205F 19.57 4.90e-13 1368-1394 
IPB002126B 12.04 5.29 


715 


PR00205 


Cadherin signature VI 


PR00205F 19.57 3.86e-16 741-767
PR00205F 19.57 2.13e-15 301-327
PR00205B 20.09 7.30e-15 996-1025
PR00205B 20.09 9.70e-15 250-279
PR00205B 20.09 1.84e-14 475-504
PR00205D 12.22 4.12e-14 332-351


715 


IPB002126 


Cadherin domain 


IPB002126B 12.04 4.79e-14 238-255 
PR00205B 20.09 4.94e-14 1210-1239 
PR00205B 20.09 7.19e-14 1315-1344 
PR00205D 12.22 9.31e-14 1294-1313 
IPB002126B 12.04 3.57e-13 463-480 
PR00205F 19.57 4.90e-13 1368-1394 
IPB002126B 12.04 5.29 
716


IPB002469 


"Dipeptidyl peptidase IV, N- 
terminus" 


IPB002469I 10.99 4.86e-16 719-737
IPB002469H 21.17 6.14e-16 674-709 
IPB002469J 8.97 3.52e-12 801-817 


716 


IPB002471 


Prolyl endopeptidase family serine 
active site 


IPB002471B 24.90 3.66e-11 706-737
IPB002469G 26.76 9.24e-11 629-667


717 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 1.00e-21 156-181 
IPB000822 14.67 4.75e-19 324-349 
IPB000822 14.67 4.46e-18 212-237 
IPB000822 14.67 3.57e-17 184-209 
IPB000822 14.67 7.43e-17 240-265 
IPB000822 14.67 1.00e-16 296-321 
IPB000822 14.67 2.69e-15 62-87 
IPB000822 14.67 4.38e-15 352-377 


717 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 8.20e-15 181-194 


717


IPB001275 


DM DNA binding domain 


IPB001275 19.17 9.07e-15 172-211 
PR00048A 9.94 3.77e-14 321-334 
PR00048A 9.94 8.62e-14 349-362 
PR00048A 9.94 3.57e-13 153-166 
IPB001275 19.17 9.71e-13 144-183 
IPB000822 14.67 1.95e-12 268-293
PR00048A 9.94 2.06e-12 237-250
PR00048A 9.94 4.18e-12 209-222
IPB000822 14.67 9.53e-12 34-59
PR00048A 9.94 6.21e-11 265-278
IPB001275 19.17 8.71e-11 200-239
PR00048A 9.94 1.41e-10 293-306
IPB001275 19.17 4.16e-10 312-351
PR00048B 5.52 5.50e-10 197-206
PR00048A 9.94 7.55e-10 59-72
PR00048B 5.52 9.36e-10 337-346
PR00048B 5.52 1.00e-09 169-178
PR00048B 5.52 3.50e-09 225-234
IPB001275 19.17 3.62e-09 256-295
PR00048B 5.52 4.50e-09 365-374 
IPB001275 19.17 5.22e-09 228-267 
IPB001275 19.17 8.75e-09 284-323 


718 


IPB000221 


Protamine P1


IPB000221 5.48 2.97e-12 74-100 
IPB000221 5.48 9.30e-12 63-89 
IPB000221 5.48 2.19e-11 103-129
IPB000221 5.48 2.59e-11 64-90
IPB000221 5.48 3.91e-11 78-104


718 


IPB000492 


Protamine 2 (PRM2) 


IPB000492B 5.26 5.88e-11 98-132
IPB000221 5.48 6.16e-11 92-118
IPB000221 5.48 6.43e-11 99-125
IPB000221 5.48 7.62e-11 60-86
IPB000492B 5.26 9.35e-11 79-113
IPB000492B 5.26 9.35e-11 102-136
IPB000221 5.48 2.73e-10 118-144
IPB000221 5.48 4.70e-10 62-88
IPB000221 5.48 4.70e-10 94-120
IPB000492B 5.26 6.97e-10 103-137 
IPB000492B 5.26 8.12e-10 106-140 
IPB000492B 5.26 8.53e-10 105-139 
IPB000221 5.48 8.89e-10 101-127 
IPB000492B 5.26 9.06e-10 78-112 
IPB000492B 5.26 9.69e-10 100-134 
IPB000221 5.48 1.00e-09 83-109
IPB000221 5.48 1.46e-09 65-91 
IPB000221 5.48 3.31e-09 109-135
IPB000221 5.48 3.31e-09 122-148 
IPB000492B 5.26 3.84e-09 75-109 
IPB000221 5.48 5.15e-09 107-133 
IPB000221 5.48 5.27e-09 52-78 


718 


PR00055 


HIV TAT domain signature III 


PR00055C 9.12 5.92e-09 16-32
IPB000221 5.48 6.19e-09 116-142 
IPB000492B 5.26 6.38e-09 94-128 
IPB000492B 5.26 6.67e-09 107-141 
IPB000221 5.48 6.88e-09 97-123 
IPB000221 5.48 6.88e-09 111-137 
IPB000492B 5.26 7.75e-09 77-111 
IPB000492B 5.26 8.34e-09 65-99 


718 


IPB000271 


Ribosomal protein L34 


IPB000271 15.87 9.78e-09 111-148 
IPB000221 5.48 9.88e-09 124-150 

IPB000492B 5.26 9.90e-09 111-145 
IPB000221 5.48 1.00e-08 76-102 


720 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site


TtyQAAAl CO C QC £. KAa. 1 1 11A Q OKO 

IrJt>0U015z o.oO o.j4e-l / ZJ4o-zJ0j 
IPB000152 8.86 4.18e-15 2191-2206 

ttjtjaaai CO C fi/C 1 QAa \A 000,0 00/1*7 
lrtJUUUOZ o.oO 3.o4e-i4 ZZ3Z-ZZ4/ 

IPB000152 8.86 3.86e-13 2108-2123 


720 


IPB003886 


Extracellular domain in nidogen 


TDDflAIOQ/Cn 11 Ol A 7Sq n OOIO OOC1 

IrrJUUjooOLI ij.yl 4. /oe-U ZzJz-ZZjI 


720 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 5.50e-13 2191-2202 


720 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.29e-13 1028-1065 


720 


PR00010 


Type II EGF-like signature III 


PR00010C 6.98 9.47e-13 2353-2363
IPB003006B 20.23 1.00e-12 1119-1156 


720 


IPB000033 


"Low-density lipoprotein (Idl) 
receptor, YWTD repeat" 


IPB000033B 7.05 3.70e-12 2196-2206 
IPB001881B 12.28 5.20e-12 2348-2359


720 


IPB002861 


Reeler domain 


IPB002861B 10.50 6.52e-12 1435-1463 
IPB002861B 10.50 7.12e-12 1606-1634 


720 


PR01303 


Plasmodium circumsporozoite 
protein signature IV 


PR01303D 10.57 7.20e-12 1441-1458 
PR00010C 6.98 1.75e-11 2196-2206
IPB000152 8.86 1.96e-11 2023-2038
IPB001881B 12.28 4.79e-11 2232-2243
IPB003006B 20.23 4.91e-11 386-423
IPB003006B 20.23 5.30e-11 1208-1245
IPB002861B 10.50 7.08e-11 1549-1577
IPB003006B 20.23 8.43e-11 199-236
IPB001881B 12.28 8.58e-11 2066-2077
IPB001881B 12.28 9.53e-11 2023-2034

mnAA^AAfn on 0*1 A ^ 1 _ 1 1 tc^ OAO 

IPB003006B 20.23 9.61e-11 756-793


720 


IPB000981 


Neurohypophysial hormone 


IPB000981A 17.34 1.60e-10 1594-1621 
IPB003006B 20.23 2.08e-10 847-884 
IPB003886D 13.91 2.33e-10 2191-2210 
IPB000033B 7.05 4.48e-10 2353-2363 


720 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367A 11.78 5.83e-10 2116-2136

PR01303D 10.57 5.90e-10 1612-1629
IPB000033B 7.05 7.10e-10 2113-2123 


720 


IPB001862 


Membrane attack complex 
components/perforin/complement C9 


IPB001862A 12.54 8.02e-10 1714-1729 


720 


PR00907 


Thrombomodulin signature VII 


PR00907G 10.43 8.09e-10 2348-2374 
IPB003006B 20.23 8.56e-10 104-141 
IPB001881B 12.28 8.71e-10 2108-2119 
PR00907G 10.43 8.85e-10 2232-2258 
IPB003006B 20.23 8.92e-10 938-975 
IPB003886D 13.91 9.41e-10 2348-2367 
PR00907B 11.50 9.64e-10 2228-2244
IPB003006B 20.23 1.35e-09 479-516
PR01303D 10.57 2.00e-09 1726-1743 


720 


PR01472


Intercellular adhesion
molecule/vascular cell adhesion
molecule-1 signature III


PR01472C 14.40 3.41e-09 994-1009 


720


lrDUUlOOI 


Dvrr-iiKe aomain 


PR00010C 6.98 3.63e-09 2113-2123 
IPB003006B 20.23 3.77e-09 1299-1336 
IPB000033A 21.82 4.35e-09 2053-2075 
IPB002861B 10.50 4.48e-09 1663-1691 
IPB003006B 20.23 4.81e-09 10-47 
IPB003367A 11.78 5.13e-09 2318-2338 
IPB003006B 20.23 5.50e-09 572-609 




PR01474


Vascular cell adhesion molecule-1
(VCAM-1) signature VI


PPA1A74P 14 81 ^ 7Af»_AQ 1991 19^4 
rKUI4/4r 14.01 J./Oe-Ul/ 1ZZ1-1ZJ4 

TPRnn^nn/>R 9n 9^ ^ no 90^ ^n 


720 


PR01536 


Interleukin-1 receptor type I and type
II family signature III 


PR01536C 19.92 5.85e-09 393-416 
PR01536C 19.92 6.08e-09 1126-1149 
PR01536C 19.92 7.46e-09 763-786 

PRA1 10 09 7 SRp-AO 191 S-19^8 

PR00010C 6.98 8.02e-09 2237-2247 
IPB001862A 12.54 8.55e-09 1486-1501 
IPB002861B 10.50 8.98e-09 1720-1748 
IPB002861C 23.17 9.02e-09 1650-1704


720 


IPB000967




IPB000967E 21.88 9.20e-09 1443-1483


720 


IPB000118 


Granulin 


IPB000118B 7.94 9.20e-09 2011-2049
PR00907G 10.43 9.27e-09 2108-2134
PR00907B 11.50 9.43e-09 2344-2360
IPB002861B 10.50 9.59e-09 1492-1520 


721 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 8.05e-14 71-95 
IPB000135D 2.13 5.27e-13 72-96 
IPB000135D 2.13 9.46e-12 73-97 
IPB000135D 2.13 4.78e-11 70-94


721 


IPB003874 


CDC45-like protein 


IPB003874C 5.49 8.27e-11 74-85


721 


IPB000897 


GTP-binding signal recognition
particle (SRP54) domain


IPB000897A 9.15 8.60e-11 454-473
TpRnoonsn? n 3 o<jp-i 0 74-98 


721 


IPB001580 


Calreticulin family 


IPB001580F 2.93 8.31e-10 78-87
IPB000135D 2.13 9.02e-10 69-93 
IPB000135D 2.13 1.00e-09 65-89 
TPRnnissnp 9 i 4^<» no 7£-Rs 

IPB000135D 2.13 5.09e-09 66-90 

ii D\j\JiJO\jr £.yj u.ojc >-\jy /h-oj 
IPB000135D 2.13 7.00e-09 75-99
IPB000135D 2.13 8.00e-09 68-92
IPB000135D 2.13 9.36e-09 63-87


722 


IPB001140 


ABC transporter transmembrane 
region 


IPB001140A 21.73 8.36e-20 1311-1357
IPB001140A 21.73 9.29e-18 499-545
IPB001140B 15.62 4.79e-15 615-653
IPB001140B 15.62 1.16e-10 1427-1465


722 


PR00326 


GTP1/OBG GTP-binding protein 
family signature I 


PR00326A 8.70 6.66e-10 513-533 


722 


IPB000795 


GTP-binding elongation factor 


IPB000795A 10.67 7.88e-10 1324-1339 


722 


IPB000897 


GTP-binding signal recognition 
particle (SRP54) domain 


IPB000897A 9.15 1.54e-09 512-531
IPB000795A 10.67 2.85e-09 512-527
PR00326A 8.70 4.49e-09 1325-1345
IPB000897A 9.15 5.57e-09 1324-1343


722 


IPB001324 


Phosphoribulokinase family 


IPB001324A 18.12 8.00e-09 1321-1342 
722




Disease resistance protein signature I


PRflfl^fvlA 8 9Q 8 AA*» AO ^19 <97 

rivuujiwA o.zy o.uue-uy oizoz/ 


722


PR01014


Neuropeptide Y2 receptor signature
VI


PPfll A14F K OO o 7A/> HO £A7 

riwiviftr ij. zz o. /4e-uy O4/-003 


723 


PR01217 


Proline rich extensin signature VII


PR01217G 4.02 7.16e-09 242-267 

PR01 917H 4 <J7 7 4Qf>-AQ 4Q<; ^16 


723 


IPB001084 


Microtubule associated Tau protein 


IPB001084C 7.66 9.64e-09 308-325 


723 


IPB001101 


Plectin repeat 


IPB001101K 8.53 9.92e-09 29-72 


724 


IPB001552


Acyl-CoA dehydrogenase 


IPB001552E 22.77 2.46e-19 158-198 
ttjraai 9/i qq k ac^ i o /co i ao 

IPB001552C 25.04 7.75e-15 13-53 


725


IPB000998


MAM domain 


lrrJUUUyyolJ lo.OO l.yoe-15 526-549 


725 


IPB003886 


Extracellular domain in nidogen 


IPB003886D 13.91 8.77e-15 236-255 


725 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 2.89e-14 109-124


725 


IPB001881 


Calcium-binding EGF-like domain


IPB001881B 12.28 5.00e-14 191-202 

IPB000152 8.86 1.00e-13 236-251
IPB000152 8.86 1.82e-13 191-206 

TDRAA1 QfilD io no / 1^1 AO 1 OA 
lirtSUUloolD IZ.Zo 4. /De-13 lUy-lZU 


725


IPB001774


Delta serrate ligand 


TDQAA1 in AC* Ifi OC O 1 la 1 1 *71 110. 

lrtJUUl / /4C lo.ZD y. Ue-lJ / 1-1 13 

IPB000998B 17.20 1.00e-12 409-421


725 


PR00020 


MAM domain signature I 


PR00020A 20.48 2.88e-11 407-425
IPB000998C 18.63 5.30e-11 463-478
IPB001881B 12.28 8.58e-11 236-247


725


PR00907


Thrombomodulin signature II 


PR00907B 11.50 2.44e-10 143-159


725 


IPB000561 


EGF-like domain 


IPB000561 4.89 3.25e-10 80-88 


725




"Low-density lipoprotein (ldl) 
receptor, YWTD repeat"


lriJUUUU33D /.Uj D.3je-lU Z41-Z0 1 
TPDAAAAUD *7 ^ 07n AO 1 Q£ 9A£ 

irD\)\)\)\)jot> /.id j.y/e-uy lyo-zuo 


725


IPB000167


Dehydrin 


fPRAAA1£7A 8 58 7 AO 791 7^A 
lri3UUUlO/A 0.35 /.14e-Uy jZj-jjU 


725 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367A 11.78 9.79e-09 158-178 


726


IPB001258


NHL repeat 


lrfc>UUlZjoD Zo.Ol 4.3Ue-l/ Oiy-Oj3 
IPB001258B 28.61 7.00e-17 525-559

TPRAA195RR 9R A1 1 97*» 1 /111 
TPR0A19SRR 98 ^ 01p-lfi 47R-S19 


726 


PR01406 


B-box zinc finger signature I 


PR01406A 20.90 8.36e-12 112-129

IPRAA19^RR 9R fil S £\fte> 1 1 ^79-606 


726




B-box C-terminal domain


TPRAA^fid-QR 99 lfi 7 fi8p_lft 1 IS- 174 i 


726 


IPB001869 


Thiol-activated cytolysins 


IPB001869C 15.61 6.06e-09 396-419 


727


IPB000198


RhoGAP domain


TPRAAAIORP 1 (\ 40 8 7 1 l^O97_04A 
irDuvjoiyo^/ io.*ty o. o ie-10 yzj-ytu 

IPB000198B 12.47 9.10e-15 833-850 


727 


IPB002219 


Phorbol esters/diacylglycerol binding 
domain


IPB002219B 12.53 3.89e-ll 724-739 

TPRAAAIORA K Q A1a 10781 707 

irouuvJiyoA ij.yj y.oie-iu /oi-/y/ 


727


IPB002551 


Coronavirus S1 glycoprotein


IPB002551J 18.56 3.60e-09 470-511 


727 


IPB001369 


Purine and other phosphorylases 
family 2 


IPB001369C 24.81 4.27e-09 36-76 


727 


IPB003351 


Dishevelled specific domain 


IPB003351C 13.82 7.24e-09 1025-1064 


729 


IPB002870


Reprolysin family propeptide 


IPB002870B 24.73 6.23e-24 131-169 
IPB002870F 18.81 6.54e-16 456-480


729


IPB001762


Disintegrin


TDOAA17AOA T5 Q1 <A Q 1« I^Q lOQ 

lrrJuui /ozA Z3.y3 o.jue-iD 3jy-3yy 
IPB002870E 11.90 8.67e-14 414-426 
IPB002870D 16.31 8.77e-13 383-398 


729 


PR01303 


Plasmodium circumsporozoite 
protein signature IV 


PR01303D 10.57 1.42e-11 1173-1190
PR01303D 10.57 1.40e-10 1488-1505 
IPB002870A 12.22 2.29e-10 81-97 
IPB002870C 11.01 2.80e-10 344-354 
PR01303D 10.57 3.91e-10 1098-1115 


729 


IPB000130 


"Neutral zinc metallopeptidases, 
zinc-binding region" 


IPB000130 5.86 7.19e-10 412-422 
IPB000118


Granulin 


TPpnnm i so io ie a no i ati.i ^io 

IrDUUUlloLr 1Z. 18 4. 5 ie-Uy 14/1-13 1^ 


729 


IPB002861 


Reeler domain 


IPB002861C 23.17 5.34e-09 969-1023 

DPAl iniT^ 1 n <*7 /C CA n no 1 ooo 19A£ 

TPRnn^R^iR in i n^t> no 1 991-19^1 

IrDUuZODlD 1U.JU /./je-Uy lZiJ"UJl 


729 


PR00269 


Pleiotrophin/midkine family 
signature I 


PR00269A 12.42 9.33e-09 1162-1186 


730 


PR01478 


Leukotriene B4 type 2 receptor 
signature V 


PR01478E 5.85 7.56e-10 149-177 


735 


IPB001331 


Guanine-nucleotide dissociation 
stimulators CDC24 family 


IPB001331C 16.09 7.35e-14 302-327 


737 


IPB002004 


"Poly-adenylate binding protein, 
unique domain" 


IPB002004C 13.84 8.14e-10 189-231


741 


PR01276 


Type II keratin signature II 


PR01276B 9.79 9.27e-10 147-159 


742 


PR00205 


Cadherin signature II 


PR00205B 20.09 5.95e-20 252-281 
PR00205D 12.22 3.25e-16 654-673 
PR00205B 20.09 7.60e-15 142-171 
PR00205F 19.57 1.00e-14 520-546 
PR00205G 13.05 1.37e-13 657-674 
PR00205F 19.57 3.10e-13 623-649 
PR00205D 12.22 5.80e-13 231-250 
PR00205D 12.22 5.80e-13 551-570 
PR00205B 20.09 6.40e-13 469-498 


742 


IPB002126 


Cadherin domain 


IPB002126B 12.04 8.71e-13 560-577 
PR00205F 19.57 1.26e-12 308-334 
PR00205G 13.05 1.30e-12 340-357 
PR00205G 13.05 4.90e-12 554-571 
PR00205D 12.22 5.37e-12 337-356 
PR00205D 12.22 8.20e-12 448-467 
PR00205G 13.05 8.50e-12 234-251
PR00205G 13.05 6.84e-11 451-468
IPB002126B 12.04 7.43e-11 240-257
PR00205F 19.57 7.63e-11 417-443
PR00205A 17.38 8.56e-11 301-320
IPB002126B 12.04 3.03e-10 457-474 
IPB002126B 12.04 9.42e-10 130-147 
IPB002126A 14.68 3.67e-09 312-328 
PR00205A 17.38 4.71e-09 513-532
PR00205E 10.82 5.50e-09 570-583 
IPB002126A 14.68 6.33e-09 204-220 
PR00205C 13.59 6.62e-09 640-652 
PR00205B 20.09 7.06e-09 572-601 
PR00205D 12.22 8.27e-09 121-140 
PR00205G 13.05 9.82e-09 124-141 


744 


IPB001862 


Membrane attack complex 
components/perforin/complement C9 


IPB001862C 26.48 8.94e-09 119-167 


745 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 6.00e-24 216-241 
IPB000822 14.67 9.18e-21 160-185 
IPB000822 14.67 1.75e-20 328-353
IPB000822 14.67 4.00e-20 518-543
IPB000822 14.67 8.50e-20 244-269
IPB000822 14.67 9.25e-19 490-515
IPB000822 14.67 7.92e-18 188-213 
IPB000822 14.67 9.31e-18 356-381 
IPB000822 14.67 9.36e-17 272-297 
IPB000822 14.67 3.40e-16 384-409 
IPB000822 14.67 8.80e-16 300-325 


745 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 5.50e-15 381-394 
PR00048A 9.94 1.00e-14 269-282 
PR00048A 9.94 1.00e-14 543-556
PR00048A 9.94 3.08e-14 185-198
PR00048A 9.94 4.46e-14 487-500
IPB000822 14.67 6.06e-14 440-465
IPB000822 14.67 2.50e-13 412-437 
PR00048A 9.94 3.57e-13 297-310 
PR00048A 9.94 6.79e-13 213-226 
PR00048A 9.94 7.43e-13 409-422 
IPB000822 14.67 8.00e-13 132-157


745 


IPB001275 


DM DNA binding domain 


IPB001275 19.17 8.00e-13 148-187 
PR00048A 9.94 3.12e-12 241-254 
PR00048A 9.94 5.76e-12 515-528 
PR00048B 5.52 7.00e-12 173-182 
IPB001275 19.17 7.58e-12 204-243 

IPB001275 19.17 3.96e-11 506-545
IPB000822 14.67 4.43e-11 546-571
IPB001275 19.17 5.76e-11 176-215
PR00048A 9.94 6.21e-11 325-338
PR00048B 5.52 7.00e-11 341-350
PR00048B 5.52 9.25e-11 503-512
PR00048B 5.52 1.00e-10 229-238
IPB001275 19.17 1.49e-10 344-383
IPB001275 19.17 4.41e-10 316-355


745 


IPB001222


TFIIS zinc ribbon domain


TPRfim 999 94 tfv* S 1 0 490-526 
IPB001275 19.17 5.50e-10 232-271
PR00048A 9.94 7.14e-10 129-142
PR00048A 9.94 7.14e-10 157-170
PR00048A 9.94 1.38e-09 437-450
IPB001275 19.17 1.46e-09 372-411 
IPB001275 19.17 3.39e-09 288-327 
PR00048B 5.52 5.50e-09 531-540 
IPB001222 24.63 8.35e-09 160-196 
IPB001275 19.17 9.09e-09 260-299 


745 


IPB001142 


Yeast membrane protein DUP 


IPB001142B 22.92 9.60e-09 290-335


745 


IPB002867 


Cysteine-rich domain (C6HC) 


IPB002867C 19.46 9.76e-09 129-146 
PR00048B 5.52 1.00e-08 313-322 


746 


IPB001909


KRAB box


IPB001909 17.37 8.65e-30 37-71


746 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 6.00e-24 291-316 
IPB000822 14.67 9.18e-21 235-260 
IPB000822 14.67 1.75e-20 403-428 
IPB000822 14.67 8.50e-20 319-344 
IPB000822 14.67 7.92e-18 263-288 
IPB000822 14.67 9.31e-18 431-456
IPB000822 14.67 9.36e-17 347-372
IPB000822 14.67 3.40e-16 459-484


746 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 5.50e-15 456-469 
PR00048A 9.94 1.00e-14 344-357 
PR00048A 9.94 3.08e-14 260-273 
IPB000822 14.67 6.06e-14 515-540 
IPB000822 14.67 2.50e-13 487-512
PR00048A 9.94 3.57e-13 372-385 
PR00048A 9.94 6.79e-13 288-301 
PR00048A 9.94 7.43e-13 484-497 
IPB000822 14.67 8.00e-13 207-232 


746 


IPB001275 


DM DNA binding domain 


IPB001275 19.17 8.00e-13 223-262 
PR00048A 9.94 3.12e-12 316-329 
PR00048B 5.52 7.00e-12 248-257 
IPB001275 19.17 7.58e-12 279-318
PR00048A 9.94 8.41e-12 428-441
IPB001275 19.17 5.76e-11 251-290
PR00048A 9.94 6.21e-11 400-413
PR00048B 5.52 7.00e-11 416-425
PR00048B 5.52 1.00e-10 304-313
IPB001275 19.17 1.49e-10 419-458
IPB001275 19.17 4.41e-10 391-430 
IPB001275 19.17 5.50e-10 307-346 
PR00048A 9.94 7.14e-10 204-217 
PR00048A 9.94 7.14e-10 232-245 
PR00048A 9.94 1.38e-09 512-525 
IPB001275 19.17 1.46e-09 447-486
IPB001275 19.17 3.39e-09 363-402 


746 


IPB001222 


TFIIS zinc ribbon domain 


IPB001222 24.63 8.35e-09 235-271 
IPB001275 19.17 9.09e-09 335-374 


746 


IPB001142 


Yeast membrane protein DUP 


IPB001142B 22.92 9.60e-09 365-410 


746


IPB002867 


Cysteine-rich domain (C6HC) 


IPB002867C 19.46 9.76e-09 204-221 
PR00048B 5.52 1.00e-08 388-397 


747 


IPB000348 


emp24/gp25L/p24 family 


IPB000348B 26.69 5.33e-31 143-188 
IPB000348A 15.21 3.63e-12 78-96 


748 


IPB000560 


Histidine acid phosphatase 


IPB000560 17.02 1.00e-16 31-53 


749 


PR00405 


HIV Rev interacting protein 
signature II


PR00405B 10.10 8.29e-19 558-575 
PR00405C 18.05 9.55e-19 579-600 
PR00405A 18.83 4.00e-18 539-558 


749 


IPB000906 


ZU5 domain 


IPB000906G 25.85 4.32e-12 827-875 
IPB000906D 23.89 7.43e-09 846-900


751 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 7.00e-24 753-778 


751 


IPB001909 


KRAB box 


IPB001909 17.37 2.86e-21 344-378 
IPB000822 14.67 3.57e-17 695-720 
IPB000822 14.67 3.25e-14 605-630 
IPB000822 14.67 9.44e-14 781-806 
IPB000822 14.67 2.50e-13 723-748 


751


PR00048


C2H2-type zinc finger signature I


PR00048A 9.94 3.37e-11 602-615
PR00048A 9.94 4.32e-11 778-791
PR00048A 9.94 5.26e-11 692-705
IPB000822 14.67 6.14e-11 633-658
PR00048A 9.94 9.53e-11 750-763
PR00048B 5.52 1.00e-10 766-775 
PR00048A 9.94 3.86e-10 720-733 


751 


IPB001580 


Calreticulin family 


IPB001580F 2.93 1.00e-09 514-523 
PR00048A 9.94 6.25e-09 630-643 
PR00048B 5.52 6.50e-09 708-717 


751 


PR01073 


Presenilin 1 signature III 


PR01073C 1.45 6.62e-09 509-520 


751 


IPB001275 


DM DNA binding domain 


IPB001275 19.17 8.18e-09 769-808 


751 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 8.45e-09 507-531 


753 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 8.11e-14 261-275


753 


PR00364 


Disease resistance protein signature 
IV 


PR00364D 14.89 4.60e-09 103-119 


753 


PR00019 


Leucine-rich repeat signature II 


PR00019B 11.42 8.91e-09 154-167 


754 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 7.84e-26 1244-1271 
IPB001599F 18.95 7.00e-24 785-814 
IPB001599H 18.42 6.40e-20 1019-1046 
IPB001599A 10.97 9.69e-18 123-141
IPB001599N 24.85 2.24e-14 1437-1469


754 


IPB001134 


"Netrin, C- terminus" 


IPB001134C 17.82 4.13e-13 1257-1271 
IPB001599M 13.29 4.71e-13 1384-1395 
IPB001599G 13.87 8.94e-13 987-996 
IPB001599B 7.45 4.89e-12 209-221 
IPB001599D 11.61 6.90e-12 728-738 
IPB001599J 20.99 3.00e-11 1085-1110
IPB001599I 10.83 7.60e-11 1054-1063
IPB001599K 8.15 1.46e-10 1214-1225
IPB001599C 14.40 3.55e-09 236-252 
IPB001599E 11.06 9.77e-09 755-764 


755 


IPB002181 


Fibrinogen beta and gamma chains 
C-terminal globular domain 


IPB002181E 27.75 4.44e-21 344-376 
IPB002181D 29.18 5.14e-19 298-338 
IPB002181F 18.85 2.13e-14 398-421 
IPB002181C 15.87 5.78e-12 280-292 
IPB002181A 18.44 2.32e-10 244-260 


756 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 5.30e-11 457-494


756 


PR00014 


Fibronectin type III repeat signature 
IV 


PR00014D 15.12 5.26e-10 671-685 
IPB003006B 20.23 5.68e-10 174-211 
IPB003006B 20.23 5.68e-10 275-312 


756 


PR00406 


Cytochrome B5 reductase signature 
VI 


PR00406F 4.29 6.03e-09 140-148


756 


IPB003866 


Isoflavone reductase 


IPB003866D 19.80 9.48e-09 454-506 


757 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 6.85e-13 240-254 


757 


PR00019 


Leucine-rich repeat signature I 


PR00019A 11.72 7.14e-11 149-162
PR00019B 11.42 8.00e-10 98-111 
PR00019B 11.42 7.55e-09 122-135 
PR00019B 11.42 8.09e-09 146-159 


757 


IPB002889 


WSC domain 


IPB002889B 11.76 8.97e-09 599-645 


757 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 9.31e-09 335-372 
IPB002889B 11.76 9.44e-09 598-644 


758 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 6.85e-13 240-254 


758 


PR00019 


Leucine-rich repeat signature I 


PR00019A 11.72 7.14e-11 149-162
PR00019B 11.42 8.00e-10 98-111
PR00019B 11.42 7.55e-09 122-135
PR00019B 11.42 8.09e-09 146-159 


758 


IPB002889 


WSC domain 


IPB002889B 11.76 8.97e-09 603-649 


758 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 9.31e-09 335-372 
IPB002889B 11.76 9.44e-09 602-648 


759 


IPB000203 


GPS domain 


IPB000203A 18.40 9.25e-20 966-996
IPB000203B 13.98 8.88e-15 1086-1107 


759 


IPB000832 


G-protein coupled receptors family 2 
(secretin-like) 


IPB000832C 19.53 9.46e-13 1086-1115 


759 


PR00249 


Secretin-like GPCR superfamily 
signature III 


PR00249C 15.44 1.73e-10 1088-1111 
IPB000832G 15.17 7.81e-09 1256-1281 


760 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 4.00e-24 277-302
IPB000822 14.67 3.45e-21 361-386 
IPB000822 14.67 1.75e-20 193-218 
IPB000822 14.67 3.25e-19 109-134 
IPB000822 14.67 4.00e-19 389-414
IPB000822 14.67 8.50e-19 165-190 
IPB000822 14.67 1.00e-18 249-274 
IPB000822 14.67 5.85e-18 305-330 
IPB000822 14.67 1.60e-16 137-162 
IPB000822 14.67 3.40e-16 333-358
IPB000822 14.67 5.50e-15 221-246


760


PR00048


C2H2-type zinc finger signature I


PR00048A 9 94 S4e-14 110-141 


760


IPB001275


DM DNA binding domain


IPB001275 19.17 6.55e-14 217-276
IPB001275 19.17 8.05e-14 321-360
IPB001275 19.17 8.20e-14 153-192
IPB001275 19.17 2.14e-13 349-388
IPB001275 19.17 4.57e-13 265-304 
PR00048A 9.94 4.86e-13 218-231 
PR00048A 9.94 4.86e-13 274-28 


760 


IPB002867 


Cysteine-rich domain (C6HC) 


IPB002867C 19.46 8.11e-09 274-291 
PR00048A 9.94 8.12e-09 358-371 


760


IPB002634


BolA-like protein


IPB002634A 23.30 8.25e-09 298-332


760


PR00995


Capillovirus serine protease
(S35) signature VI


PR00995F 16.50 9.73e-09 311-329


761


PR00121


Sodium/potassium-transporting
ATPase signature IV


PR00121D 16.73 7.12e-15 173-194 


761


IPB001757 


E1-E2 ATPases


IPB001757B 13.64 9.65e-13 588-617 
IPB001757A 14.16 4.18e-12 179-190 


761 


PR00119 


P-type cation-transporting ATPase 
superfamily signature II 


PR00119B 12.03 9.61e-12 180-194 


761


IPB000150


Cof protein


IPB000150C 20.72 7.47e-09 595-627 


763 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 8.88e-09 172-197 


764


IPB001310


HIT (Histidine triad) family


IPB001310A 18.76 3.25e-18 177-207
IPB001310B 21.00 2.93e-12 241-267 


764 


PR00332 


Histidine triad family signature II 


PR00332B 14.02 6.26e-10 189-207 


767 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 4.52e-10 101-125 
IPB000135D 2.13 9.71e-10 103-127
IPB000135D 2.13 9.90e-10 100-124 
IPB000135D 2.13 3.18e-09 104-128 
IPB000135D 2.13 9.55e-09 102-126 


768 


PR00074 


Protein-lysine 6-oxidase precursor
signature V 


PR00074E 11.34 9.46e-14 327-347
PR00074B 7.56 4.98e-12 260-284 


768


IPB001695 


Lysyl oxidase 


IPB001695E 9.12 5.70e-12 244-285 


768


PR00258


Speract receptor signature IV


PR00258D 14.29 7.19e-12 94-108
PR00258F 14.06 1.18e-11 117-129
PR00258A 13.56 1.54e-10 29-45
PR00074D 21.66 2.94e-10 305-326
PR00258A 13.56 3.70e-10 139-155
PR00258C 9.05 4.95e-10 177-187
PR00258D 14.29 6.29e-10 210-224
PR00258C 9.05 9.34e-10 63-73 
PR00258B 7.94 6.14e-09 48-59 
IPB001695F 11.10 6.87e-09 285-313 


771


IPB001084


Microtubule associated Tau protein


IPB001084C 7.66 1.00e-08 105-122


771 


IPB000374


Phosphatidate cytidylyltransferase


IPB000374B 15.86 2.06e-27 358-385
IPB000374A 12.59 3.65e-16 254-266


774 


PR00320 


G protein beta WD-40 repeat
signature I 


PR00320A 13.15 7.95e-11 190-204
PR00320B 12.82 2.08e-10 190-204 
PR00320C 12.32 4.33e-09 190-204 


775 


IPB001422 


Neuromodulin (GAP-43) 


IPB001422C 16.82 1.95e-10 155-190 


775 


IPB001990 


Granins (chromogranin or 
secretogranin) 


IPB001990C 33.59 8.01e-10 150-197 


776 


IPB002549 


Domain of unknown function DUF20 


IPB002549B 19.59 9.27e-09 229-266 


778 


IPB002884 


Proprotein convertase P-domain 


IPB002884B 15.69 6.33e-09 114-131 


779 


IPB000361 


Hypothetical hesB/yadR/yfhF family 


IPB000361B 19.14 3.08e-19 119-150 
IPB000361A 17.83 2.71e-16 70-90 
780


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 9.28e-10 131-168


783 


IPB002223 


Pancreatic trypsin inhibitor (Kunitz) 
family 


IPB002223 17.66 3.88e-25 556-590 


783 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885A 11.46 5.57e-19 13-50 


783 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 6.26e-19 6-58 
IPB001442A 26.12 4.44e-18 3-55 
IPB001442A 26.12 3.17e-17 185-237
IPB001442A 26.12 3.60e-17 191-243 
IPB000885B 19.15 5.72e-17 2-55 
IPB000885B 19.15 6.29e-17 11-64 
IPB001442A 26.12 7.51e-17 12-64 
IPB001442A 26.12 1.21e-16 197-249
IPB000885B 19.15 2.19e-16 193-246 
IPB001442A 26.12 3.51e-16 9-61 

IPB000885A 11.46 5.06e-16 198-235

IPB001442A 26.12 6.02e-16 188-240 
IPB000885B 19.15 7.83e-16 8-61 
IPB000885A 11.46 1.61e-15 19-56
IPB000885B 19.15 3.65e-15 202-255 
IPB000885B 19.15 4.39e-15 184-237
IPB000885B 19.15 4.49e-15 190-243 
IPB000885B 19.15 8.09e-15 17-70 
IPB001442A 26.12 9.29e-15 182-234 
IPB001442A 26.12 9.80e-15 15-67 


783 


PR00453 


Von Willebrand factor type A 
domain signature I 


PR00453A 11.78 1.75e-14 265-282 
IPB000885A 11.46 2.29e-14 201-238 
IPB000885A 11.46 3.92e-14 210-247 
IPB000885B 19.15 6.76e-14 14-67 
IPB000885B 19.15 6.97e-14 187-240 
IPB000885A 11.46 7.08e-14 22-59 
IPB001442A 26.12 7.65e-14 200-252 
IPB000885B 19.15 7.78e-14 5-58 
IPB001442A 26.12 8.63e-14 203-255 
irt>uuuooj/v i i.no v. / fc-iH zj-oz 
IPB001442A 26.12 1.00e-13 194-246 
IPB000885A 11.46 1.44e-13 10-47
IPB000885A 11.46 2.89e-13 195-232 
IPB001442B 12.38 4.67e-13 60-80 
IPB000885A 11.46 6.33e-13 207-244 
IPB000885B 19.15 7.07e-13 196-249 

IPB000885B 19.15 7.46e-13 199-252 
IPB001442B 12.38 1.31e-12 22-42 


783 


IPB001073


Complement C1q protein


IPB001073A 22.14 1.36e-12 56-90
IPB001073A 22.14 1.72e-12 203-237
IPB001073A 22.14 2.80e-12 119-153
IPB000885A 11.46 2.93e-12 7-44
IPB001442A 26.12 5.05e-12 24-76
IPB000885A 11.46 5.93e-12 213-250
IPB000885A 11.46 6.04e-12 20 


783 


PR00759 


Basic protease (Kunitz-type) 
inhibitor family signature III 


PR00759C 12.43 6.28e-11 575-590
IPB001073A 22.14 7.00e-11 59-93
IPB000885A 11.46 7.57e-11 28-65
IPB001073A 22.14 8.17e-11 142-176
IPB001073A 22.14 8.33e-11 50-84
IPB001073A 22.14 8.67e-11 15-49
IPB001442B 12.38 8.71e-11 37-57
783


IPB000817 


Prion protein 


IPB000817A 8.34 9.70e-10 132-174 
PR00759B 12.35 9.72e-10 565-575
IPB001442A 26.12 9.92e-10 30-82 
IPB000885A 11.46 1.83e-09 189-226 
IPB001442B 12.38 1.97e-09 210-230 
IPB001073A 22.14 2.27e-09 128-162 
IPB000885B 19.15 2.47e-09 


784 


IPB001541 


SUR2-type hydroxylase/desaturase 
catalytic domain 


IPB001541A 12.30 5.50e-11 164-176
IPB001541B 11.65 4.86e-09 251-260 


784 


IPB001369 


Purine and other phosphorylases 
family 2 


IPB001369A 12.23 8.71e-09 2-15 


785 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 4.96e-10 367-404 
IPB003006B 20.23 6.19e-09 1589-1626 


786 


PR00918 


Calicivirus non-structural polyprotein 
family signature I 


PR00918A 13.81 3.59e-12 27-47 


786 


IPB000135


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 4.25e-12 186-210 
IPB000135D 2.13 9.24e-12 187-211
IPB000135D 2.13 6.42e-11 188-212
IPB000135D 2.13 1.68e-10 185-209 


786 


IPB002078 


Sigma-54 factor interaction protein 
family 


IPB002078A 20.43 6.31e-10 33-67 


786 


PR00364 


Disease resistance protein signature I 


PR00364A 8.29 7.11e-10 32-47


786 


IPB000765 


GTP1/OBG family 


IPB000765 26.91 7.67e-10 31-74 


786 


IPB000897 


GTP-binding signal recognition 
particle (SRP54) domain


IPB000897A 9.15 8.26e-10 393-412 


786 


IPB001580


Calreticulin family 


IPB001580F 2.93 8.31e-10 200-209
IPB001580F 2.93 9.44e-10 201-210 


786 


IPB000623 


Shikimate kinase 


IPB000623A 19.06 1.64e-09 394-423 


786 


IPB000619 


Guanylate kinase 


IPB000619A 18.08 1.86e-09 394-411 
IPB001580F 2.93 1.90e-09 199-208


786 


PR00094 


Adenylate kinase signature I 


PR00094A 9.62 2.43e-09 34-47 


786 


PR00830 


Endopeptidase La (Lon) serine 
protease (S16) signature I


PR00830A 8.52 4.50e-09 37-56 


786 


IPB001482 


Bacterial type II secretion system 
protein E 


IPB001482B 12.05 4.60e-09 390-412 
IPB000135D 2.13 4.73e-09 191-215 


786 


IPB000850 


Adenylate kinase 


IPB000850C 18.89 5.03e-09 149-179 
IPB000135D 2.13 6.00e-09 190-214 


788 


PR00452 


SH3 domain signature II 


PR00452B 11.47 6.03e-09 14-29


789 


PR00452 


SH3 domain signature II 


PR00452B 11.47 6.03e-09 87-102 


790 


IPB001820 


Tissue inhibitors of 
metalloproteinases 


IPB001820C 11.81 1.56e-15 73-85
IPB001820B 10.75 2.44e-14 54-64 
IPB001820D 16.18 9.10e-14 91-105 
IPB001820A 8.17 2.52e-11 16-29


791 


IPB001304 


C-type lectin domain 


IPB001304A 17.98 3.00e-17 149-173 


791 


PR01408 


Macrophage scavenger receptor 
signature VI 


PR01408F 9.76 4.87e-09 64-88 


792 


IPB002213 


UDP-glucoronosyl and UDP- 
glucosyl transferase 


IPB002213 27.73 3.37e-40 276-322 


794


IPB000339 


ubiE/COQ5 methyltransferase family 


IPB000339D 24.04 6.07e-14 146-188


794 


PR00508 


S21 class N4 adenine-specific DNA 
methyltransferase signature II 


PR00508B 17.31 3.88e-09 167-187 


794 


IPB000682 


Protein-L-isoaspartate(D-aspartate) 
O-methyltransferase 


IPB000682C 16.46 6.79e-09 68-92 


795 


PR00237 


Rhodopsin-like GPCR superfamily
signature III 


PR00237C 14.77 1.30e-12 508-530 
PR00237B 12.45 8.62e-12 463-484 
PR00237D 9.76 3.37e-11 544-565


795 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 2.42e-10 522-533 
795


PR01157 


P2 purinoceptor signature IV 


PR01157D 16.03 2.98e-09 662-674


795 


PR00173 


Glutamate-aspartate symporter 
signature VI 


PR00173F 10.23 9.45e-09 705-724 
PR00237F 14.34 9.56e-09 645-669 


799 


PR01539 


Interleukin-1 receptor type II 
precursor signature IX


PR01539I 14.65 9.06e-09 162-185


802 


IPB000117


Kappa casein 


IPB000117D 10.18 8.71e-09 506-540 


805 


IPB000171 


Bacterial-type phytoene 
dehydrogenase 


IPB000171E 7.19 8.20e-09 29-39 


806 


IPB001774 


Delta serrate ligand


IPB001774D 19.23 5.91e-09 50-96 


806 


IPB000034 


Laminin B 


IPB000034C 12.97 7.31e-09 84-102 


806 


IPB000561 


EGF-like domain 


IPB000561 4.89 8.07e-09 84-92 


807 


IPB001774 


Delta serrate ligand 


IPB001774D 19.23 5.91e-09 50-96 


807 


IPB000034


Laminin B 


IPB000034C 12.97 7.31e-09 84-102 


807 


IPB000561 


EGF-like domain 


IPB000561 4.89 8.07e-09 84-92 


808 


IPB001774 


Delta serrate ligand 


IPB001774D 19.23 5.91e-09 50-96 


808 


IPB000034 


Laminin B 


IPB000034C 12.97 7.31e-09 84-102 


808 


IPB000561 


EGF-like domain


IPB000561 4.89 8.07e-09 84-92 


809 


PR00436 


Interleukin-8 signature I 


PR00436A 15.20 9.36e-10 14-37 


810 


IPB001187 


Tissue Factor (TF) 


IPB001187G 15.20 7.00e-10 40-76


811


IPB001039


"Major histocompatibility complex 
protein, Class I" 


IPB001039B 27.55 8.79e-09 98-149 


812 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.71e-12 113-150 
IPB003006B 20.23 9.14e-12 406-443 
IPB003006B 20.23 1.00e-11 213-250


812 


PR01536 


Interleukin-1 receptor type I and type 
II family signature III 


PR01536C 19.92 9.23e-11 512-535
IPB003006B 20.23 6.40e-10 19-56 
IPB003006B 20.23 9.64e-10 505-542 
IPB003006B 20.23 8.62e-09 311-348 
PR01536C 19.92 9.19e-09 120-143 


813 


IPB003006 


Immunoglobulin and major
histocompatibility complex domain


IPB003006B 20.23 8.71e-12 428-465 
IPB003006B 20.23 8.71e-12 1996-2033 
IPB003006B 20.23 9.14e-12 2289-2326 
IPB003006B 20.23 1.00e-11 2096-2133


813 


PR01536 


Interleukin-1 receptor type I and type
II family signature III


PR01536C 19.92 9.10e-11 1707-1730
PR01536C 19.92 9.23e-11 2395-2418
IPB003006B 20.23 4.60e-10 1700-1737
IPB003006B 20.23 6.40e-10 1902-1939 
IPB003006B 20.23 8.92e-10 1603-1640 
IPB003006B 20.23 9.64e-10 2388-2425 
IPB003006B 20.23 3.42e-09 1506-1543 


813 


PR01076 


Caldesmon signature IV 


PR01076D 8.07 5.07e-09 1457-1478 
IPB003006B 20.23 7.58e-09 1799-1836 
IPB003006B 20.23 8.62e-09 2194-2231 
PR01536C 19.92 9.19e-09 2003-2026


813 


PR01472 


Intercellular adhesion 
molecule/vascular cell adhesion 
molecule-1 signature I


PR01472A 16.78 9.64e-09 1755-1771 


814 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 7.60e-16 219-233 


814 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.71e-12 623-660 
IPB003006B 20.23 8.71e-12 2191-2228 
IPB003006B 20.23 9.14e-12 2484-2521 
IPB003006B 20.23 1.00e-11 2291-2328


814 


PR01536


Interleukin-1 receptor type I and type 
II family signature III 


PR01536C 19.92 9.10e-11 1902-1925
PR01536C 19.92 9.23e-11 2590-2613
IPB003006B 20.23 4.60e-10 1895-1932
IPB003006B 20.23 6.40e-10 2097-2134
IPB003006B 20.23 8.92e-10 1798-1835
IPB003006B 20.23 9.64e-10 2583-2620 
IPB003006B 20.23 3.42e-09 1701-1738 


814 


PR01076


Caldesmon signature IV 


PR01076D 8.07 5.07e-09 1652-1673 
IPB003006B 20.23 7.58e-09 1994-2031
IPB003006B 20.23 8.62e-09 2389-2426
PR01536C 19.92 9.19e-09 2198-2221


814 


PR01472 


Intercellular adhesion 
molecule/vascular cell adhesion 
molecule-1 signature I


PR01472A 16.78 9.64e-09 1950-1966 


816 


IPB000074 


Apolipoprotein A1/A4/E 


IPB000074B 29.17 7.49e-10 117-170 
IPB000074B 29.17 8.75e-10 95-148 
IPB000074B 29.17 9.20e-10 62-115 
IPB000074C 22.23 2.62e-09 90-127 
IPB000074C 22.23 4.35e-09 112-149 
IPB000074B 29.17 8.48e-09 201-254 


817 


IPB000074 


Apolipoprotein A1/A4/E 


IPB000074B 29.17 7.49e-10 117-170 
IPB000074B 29.17 8.75e-10 95-148
IPB000074B 29.17 9.20e-10 62-115
IPB000074C 22.23 2.62e-09 90-127
IPB000074C 22.23 4.35e-09 112-149
IPB000074B 29.17 8.48e-09 201-254


819 


IPB001211


Phospholipase A2 


IPB001211B 17.16 3.12e-31 44-71


819 


PR00389 


Phospholipase A2 signature III 


PR00389C 17.85 2.50e-20 56-74 
PR00389B 10.67 6.91e-16 37-55 
IPB001211D 11.66 5.50e-14 104-119
PR00389E 13.06 8.20e-14 104-120
IPB001211C 14.62 1.56e-11 79-97


821 


IPB001354 


Mandelate racemase/muconate 
lactonizing enzyme family 


IPB001354C 32.55 1.00e-24 210-251 
IPB001354D 32.92 2.07e-18 281-326 
IPB001354B 18.16 3.91e-18 87-113 
IPB001354E 9.47 6.23e-09 370-382 


822 


IPB002164 


Nucleosome assembly protein (NAP) 


IPB002164B 25.75 1.00e-36 102-138
IPB002164A 24.21 6.40e-34 21-58 
IPB002164C 11.48 6.68e-21 151-170 


822 


IPB000135


High mobility group proteins HMG1 
and HMG2


innAAAHf r» a i a e a*7 i i *\ar AAA 

IPB000135D 2.13 5.27e-13 285-309 
IPB000135D 2.13 1.41e-11 286-310
IPB000135D 2.13 1.82e-11 283-307
IPB000135D 2.13 3.76e-11 289-313
IPB000135D 2.13 3.97e-11 287-311
IPB000135D 2.13 4.27e-11 288-312
IPB002164D 9.19 7.65e-11 232-242
IPB000135D 2.13 1.68e-10 282-306 

TTlOAAAl A CA A 11 /I AA_ 1 A A O 1 1Af 

IPB000135D 2.13 4.03e-10 281-305
IPB000135D 2.13 4.91e-10 284-308 


822 


IPB001580 


Calreticulin family 


IPB001580F 2.93 2.35e-09 300-309
IPB000135D 2.13 2.64e-09 280-304 
IPB000135D 2.13 6.27e-09 291-315 

TDDAAA 1 O 1 1 *7 17o AO TOO 11 /C 

IPB000135D 2.13 7.55e-09 279-303 
IPB000135D 2.13 8.91e-09 290-314


822 


IPB001326 


Elongation factor 1 beta/beta'/delta
chain 


IPB001326C 9.19 9.16e-09 286-301 


823 


IPB000222 


Protein phosphatase 2C subfamily 


IPB000222F 19.87 4.94e-15 256-276 
IPB000222E 14.28 6.33e-15 228-246 
IPB000222G 9.17 1.95e-12 282-295
IPB000222C 6.84 2.08e-12 147-156 
IPB000222H 9.33 7.97e-12 318-330 
IPB000222B 15.80 2.86e-10 115-125

TPRf1ftft999r* 1 1 HA H HA a. flO tO£ Hf\1 

irDUUUZZZJJ 1 1. /4 z. /4e-Uy loO-zOJ 
IPB000222I 8.91 4.72e-09 379-388 


824


IPB001007


von Willebrand factor, type C
repeat




825 


PR00245 


Olfactory receptor signature III 


PR00245C 14.65 9.53e-17 59-75 


825 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 9.25e-14 1-12
PR00245D 9.34 1.53e-13 119-128 

PR00245E 8.96 6.81e-12 166-177 

PR00245B 13.73 1.00e-10 12-24

IPB000276D 9.40 3.08e-09 165-181 


825 


PR00237 


Rhodopsin-like GPCR superfamily 
signature V 


PR00237E 13.03 3.83e-09 82-105 

PR00237G 19.23 1.00e-08 155-181


826 


PR00245 


Olfactory receptor signature III 


PR00245C 14.65 9.53e-17 173-189 


826 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 9.25e-14 117-128
PR00245D 9.34 1.53e-13 233-242 
PR00245E 8.96 6.81e-12 280-291

DT?f)AO/15 A 1 A AO 7 Mo 11Q1 1 AO 

rivuuz**jA lu.yc /. i4e-iz yi-iuz 
PR00245B 13.73 8.14e-10 128-140 


826 


PR00237 


Rhodopsin-like GPCR superfamily 
signature III


PR00237C 14.77 2.02e-09 103-125 

IPB000276D 9.40 3.08e-09 279-295


826 


PR00534


Melanocortin receptor family
signature I




826 


PR00896


Vasopressin receptor signature II


PR00896B 9.36 7.23e-09 54-65

PR00237G 19.23 1.00e-08 269-295 


827 


IPB001169 


"Integrin beta, C-terminus"


IPB001169J 7.42 4.63e-10 40-53


827 


PR01186 


Integrin beta subunit signature XI 


PR01186K 7.39 7.27e-10 40-53
IPB001169K 27.45 5.50e-09 42-84
PR01186K 7.39 9.75e-09 6-19


828


IPB000198


RhoGAP domain


LrDUUUiyoi^ io.4y i.zse-iu zzo-z**j 


829


IPB000859


CUB domain


IPB000859 19.99 7.00e-25 10-45


830


IPB000859


CUB domain


IPB000859 19.99 7.00e-25 10-45


831 


PR00193 


Myosin heavy chain signature III 


PR00193C 11.66 9.77e-24 177-204 


831 


IPB000857 


Core domain in kinesin and myosin 
motors 


IPB000857C 10.82 4.84e-19 175-197
PR00193B 12.36 6.81e-18 125-150
IPB000857D 12.93 8.28e-18 204-242
PR00193A 14.87 8.50e-12 65-84
IPB000857A 15.90 5.58e-11 42-95
IPB000857B 11.35 1.00e-10 106-152


831 


PR00364 


Disease resistance protein signature I 


PR00364A 8.29 4.86e-09 127-142


832


PR00193


Myosin heavy chain signature III


PR00193C 11.66 9.77e-24 177-204


832 


IPB000857 


Core domain in kinesin and myosin 
motors 


IPB000857C 10.82 4.84e-19 175-197
PR00193B 12.36 6.81e-18 125-150 
IPB000857D 12.93 8.28e-18 204-242 

TUDAAAOCTC OC f\H 1 zlO«. IO OOO HA1 

lrBUUUoj /b zj.07 l.4/e-lz zoo- 34 1 

PR00193A 14.87 8.50e-12 65-84
IPB000857A 15.90 5.58e-11 42-95
IPB000857B 11.35 1.00e-10 106-152 


832 


PR00364 


Disease resistance protein signature I 


PR00364A 8.29 4.86e-09 127-142
IPB000857F 15.97 6.50e-09 365-397


834 


IPB002350 


Kazal-type serine protease inhibitor 
family 


IPB002350 31.78 2.86e-18 143-183 


834 


IPB000716 


Thyroglobulin type-1 repeat 


IPB000716C 17.62 2.88e-18 336-354 
IPB000716D 15.49 7.16e-15 358-372


834 


IPB001999 


Osteonectin domain 


IPB001999E 15.70 7.99e-ll 272-318 


835 


IPB001323 


Erythropoietin/thrombopoeitin 


IPB001323A 17.37 8.31e-10 515-547 
835


PR00251 


Bacterial opsin signature I 


PR00251A 13.93 9.75e-10 515-534


835 


PR00807 


Pollen allergen Amb family signature 
I 


PR00807A 16.15 7.41e-09 459-476


836 


IPB001323 


Erythropoietin/thrombopoeitin 


IPB001323A 17.37 8.31e-10 515-547


836 


PR00251 


Bacterial opsin signature I 


PR00251A 13.93 9.75e-10 515-534 


836 


PR00807 


Pollen allergen Amb family signature 
I 


PR00807A 16.15 7.41e-09 459-476


838 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 5.50e-13 359-373 


838 


PR00019 


Leucine-rich repeat signature I 


PR00019A 11.72 9.33e-10 278-291 
PR00019A 11.72 9.33e-10 327-340
PR00019B 11.42 6.73e-09 179-192
PR00019A 11.72 7.27e-09 182-195


840 


IPB000243 


Proteasome B-type subunit 


IPB000243C 13.61 8.80e-09 345-355 


841 


IPB002889 


WSC domain 


IPB002889B 11.76 9.36e-11 527-573


841


PR01217


Proline rich extensin signature II


PR01217B 4.82 5.65e-10 533-549
PR01217D 4.57 7.86e-10 529-550


841 


IPB000906 


ZU5 domain 


IPB000906A 22.49 8.91e-10 158-200 
PR01217C 4.49 4.80e-09 5^8-550
IPB000906E 22.11 4.83e-09 162-202
PR01217G 4.02 5.03e-09 529-554 


841 


PR01415 


Ankyrin repeat signature II 


PR01415B 10.23 5.88e-09 177-189 
PR01415A 12.73 8.00e-09 165-177
PR01415A 12.73 8.75e-09 131-143 


841 


IPB000925 


Pneumovirus attachment
glycoprotein G 


IPB000925D 14.69 9.33e-09 404-426 
PR01217A 5.97 9.62e-09 539-551 


842 


IPB000416 


Outer Capsid protein VP4 
(Hemagglutinin) 


IPB000416P 15.37 7.40e-09 185-223 


843 


IPB000416 


Outer Capsid protein VP4 
(Hemagglutinin)


IPB000416P 15.37 7.00e-09 185-223 


844 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006A 17.51 7.11e-09 354-376


845 


IPB000998 


MAM domain 


IPB000998C 18.63 1.95e-12 833-848 
IPB000998B 17.20 1.62e-ll 761-773 


845 


PR00020 


MAM domain signature I 


PR00020A 20.48 3.62e-ll 759-777 
PR00020C 12.01 8.12e-10 832-843 
IPB000998D 18.66 9.61e-10 898-921


845 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006A 17.51 7.11e-09 354-376


845 


PR00096 


Glutamine amidotransferase 
superfamily signature III 


PR00096C 15.85 9.28e-09 534-547 


846 


IPB003160 


p53-associated protein (MDM2) 


IPB003160A 14.23 8.01e-09 82-129 


847 


IPB002642 


Lysophospholipase catalytic domain


IPB002642B 11.84 4.38e-15 1134-1158
IPB002642A 18.37 1.69e-13 1106-1131 


847 


PR00360 


C2 domain signature II 


PR00360B 11.64 8.67e-12 839-852 
IPB002642G 34.11 6.72e-10 1429-1477


847 


IPB000008 


C2 domain 


IPB000008C 23.37 2.44e-09 812-851 


848 


IPB002642 


Lysophospholipase catalytic domain 


IPB002642B 11.84 4.38e-15 383-407 
IPB002642A 18.37 1.69e-13 355-380 


848 


PR00360 


C2 domain signature II 


PR00360B 11.64 8.67e-12 88-101 
IPB002642G 34.11 6.72e-10 678-726
IPB002642E 18.19 6.91e-10 509-534 


848 


IPB000008 


C2 domain 


IPB000008C 23.37 2.44e-09 61-100 


851 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 1.43e-13 203-240 


851 


IPB003531 


Short hematopoietin receptor family
1


IPB003531C 15.87 9.38e-11 449-466
IPB003006B 20.23 6.54e-09 81-118


852 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 1.43e-13 199-236 


852 


IPB003531 


Short hematopoietin receptor family 
1 


IPB003531C 15.87 9.38e-11 445-462
IPB003006B 20.23 6.54e-09 77-114 


854 


IPB000008 


C2 domain 


IPB000008C 23.37 7.94e-25 306-345 
IPB000008C 23.37 1.17e-16 173-212


854 


PR00360 


C2 domain signature II 


PR00360B 11.64 8.20e-14 200-213 
PR00360A 15.18 1.60e-13 304-316 


854 


PR00399 


Synaptotagmin signature II 


PR00399B 14.30 1.69e-12 291-304 
IPB000008D 14.83 3.45e-11 229-247
IPB000008D 14.83 3.86e-11 361-379
PR00360B 11.64 5.94e-11 333-346
PR00399A 15.05 6.40e-11 145-160
PR00360A 15.18 8.36e-11 173-185
PR00399C 15.89 4.98e-10 348-363 
PUfinioon 19 no < ii 0 iakq no 

TPRnnnnnRP *y\ xi q i&t* in i7<> o\&\ 

irDuuwuoL' zj.j/ y. /oc-iU 1 /j-ZH 
PR00399B 14.30 6.57e-09 160-173 

PR 003QQ A 1 S OS 8 6^p-fiQ 976-9Q1 


854 


IPB002618 


UTP--glucose-1-phosphate
uridylyltransferase 


TPROOlfil 8D 9Q 94 Q RRp-00 1 89-994 


855 


IPB002870 


Reprolysin family propeptide


IPB002870B 24.71 1.78e-14 141-170
IPB002870E 11.90 4.67e-14 391-403
IPB002870F 18.81 7.00e-13 432-456
IPB002870D 16.31 6.62e-19 360-375


855 


IPB001762 


Disintegrin 


IPB001762A 23.93 1.40e-ll 336-376 


855 


IPB000130 


"Neutral zinc metallopeptidases,
zinc-binding region" 


IPB000130 5.86 5.15e-11 18Q-1Q9


855 


PR00480 


Astacin family signature II


PR00480R 14 IS 4 54<»-1fl 184-409 


855 


PR01303 


Plasmodium circumsporozoite
protein signature IV 


PR01303D 10.57 4.71e-10 953-970
PR01303D 10.57 2.75e-09 833-850


855 


IPB001670 


Iron-containing alcohol 
dehydrogenase 


IPB001670D 13.90 5.50e-09 157-172
IPB002870C 11.01 5.68e-09 317-327 
PR01303D 10.57 6.38e-09 552-569 


855 


IPB001862 


Membrane attack complex 
components/perforin/complement C9 


IPB001862A 12.54 6.66e-09 540-555 


856 


IPB003952 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 


IPB003952A 6.70 8.00e-09 14-28 


857 


PR00833 


Pollen allergen Poa pi signature VIII


PR00833H 2.61 4.11e-09 58-72


857 


IPB002989 


Mycobacterial pentapeptide repeats 


IPB002989C 13.82 8.67e-09 48-87 


858


PR00833


Pollen allergen Poa pi signature VIII


PRfiflSHW 9 £1 A 1 I** no <1 


859 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen


IPB001442A 26.12 8.26e-26 254-306 


859 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 6.77e-24 265-318 
IPB000885B 19.15 9.30e-24 247-300 
IPB000885B 19.15 1.42e-23 244-297
IPB001442A 26.12 5.96e-23 257-309
IPB001442A 26.12 8.83e-23 266-318 
IPB001442A 26.12 8.96e-23 239-291 
IPB000885B 19.15 9.45 


859 


PR01408 


Macrophage scavenger receptor 
signature VIII 


PR01408H 14.32 5.76e-16 227-246 


859 


PR00258 


Speract receptor signature I 


PR00258A 13.56 6.32e-16 333-349 
IPB001442A 26.12 8.12e-16 272-324 
IPB000885A 11.46 4.16e-15 255-292 
IPB000885B 19.15 5.76e-15 274-327
IPB000885A 11.46 5.86e-15 270-307
IPB001442A 26.12 7.88e-15 230-282
IPB000885A 11.46 2.87e-14 276-313
IPB000885B 19.15 3.43e-14 229-282 
IPB000885B 19.15 4.13e-14 277-330 
IPB000885A 11.46 5.44e-14 243-280
IPB000885A 11.46 7.78e-14 285-322 
IPB000885B 19.15 7.88e-14 280-333


859 


IPB001073 


Complement C1q protein


IPB001073A 22.14 8.40e-14 263-297 
IPB000885B 19.15 5.21e-13 226-279 
IPB001073A 22.14 5.79e-13 269-303
PR00258B 7.94 8.42e-13 352-363 
IPB001442B 12.38 9.00e-13 270-290 
IPB001442A 26.12 9.16e-13 227-279 
IPB001073A 22.14 1.54e-l 


859 


IPB000817 


Prion protein 


IPB000817A 8.34 5.85e-10 244-286 
IPB001073A 22.14 6.80e-10 287-321 
IPB000817A 8.34 8.22e-10 247-289 
IPB001442B 12.38 8.46e-10 246-266
IPB000885A 11.46 9.32e-10 234-271 
IPB001442A 26.12 9.42e-10 284-336 
IPB000885A 11.46 9.61e-10 288-325 
IPB001442B 12.38 1.24e-09 264-284 
IPB001442A 26.12 1.63e-09 221-273
IPB001073A 22.14 2.83e-09 251-285 
IPB001073A 22.14 3.53e-09 284-318
IPB001442B 12.38 4.65e-09 291-311 
IPB001442B 12.38 4.77e-09 249-269 
IPB001073A 22.14 5.64e-09 278-312 
IPB000885A 11.46 5.87e-09 291-328
IPB001442B 12.38 6.11e-09 273-293 
IPB001442B 12.38 6.84e-09 294-314 
IPB001073A 22.14 7.61e-09 239-273


860 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 8.26e-26 314-366


860 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 4.52e-24 307-360 
IPB000885B 19.15 6.77e-24 325-378 
IPB000885B 19.15 1.69e-23 304-357
IPB001442A 26.12 5.96e-23 317-369
IPB001442A 26.12 6.35e-23 299-351
IPB001442A 26.12 8.83e-23 326-378
IPB000885B 19.15 1.26 


860 


PR01408 


Macrophage scavenger receptor 
signature VIII 


PR01408H 14.32 5.76e-16 287-306 


860 


PR00258 


Speract receptor signature I 


PR00258A 13.56 6.32e-16 393-409
IPB001442A 26.12 8.12e-16 332-384 

IPB000885A 11.46 4.16e-15 315-352

IPB000885B 19.15 5.76e-15 334-387 
IPB000885A 11.46 5.86e-15 330-367 
IPB000885B 19.15 7.35e-15 289-342 
IPB001442A 26.12 7.88e-15 290-342 
IPB000885A 11.46 2.87e-14 336-373 
IPB000885B 19.15 4.13e-14 337-390 
IPB000885A 11.46 5.91e-14 303-340
IPB000885A 11.46 7.78e-14 345-382
IPB000885B 19.15 7.88e-14 340-393 


860 


IPB001073


Complement C1q protein


IPB001073A 22.14 8.40e-14 323-357 
IPB000885B 19.15 5.70e-13 286-339 
IPB001073A 22.14 5.79e-13 329-363
IPB001442A 26.12 7.28e-13 287-339 
PR00258B 7.94 8.42e-13 412-423 
IPB001442B 12.38 9.00e-13 330-350 
IPB001073A 22.14 1.54e-l 


860 


IPB000817 


Prion protein 


IPB000817A 8.34 5.65e-10 304-346
IPB001073A 22.14 6.03e-10 311-345 
IPB001073A 22.14 6.80e-10 347-381 
PR00258C 9.05 7.15e-10 427-437 
PR00258D 14.29 8.06e-10 458-472 
IPB000817A 8.34 8.42e-10 307-349 
IPB001442A 26.12 9.42e-10 344-396 
IPB000885A 11.46 9.61e-10 348-385 
IPB001073A 22.14 9.69e-10 299-333
IPB000885B 19.15 9.83e-10 283-336 
IPB000885A 11.46 9.90e-10 294-331 
IPB001442B 12.38 1.24e-09 324-344 
IPB001442A 26.12 2.41e-09 281-333 
IPB001442B 12.38 2.70e-09 309-329 
IPB001073A 22.14 3.53e-09 344-378
IPB001442B 12.38 4.65e-09 351-371 
IPB001073A 22.14 5.64e-09 338-372 
IPB000885A 11.46 5.87e-09 351-388 
IPB001442B 12.38 6.11e-09 333-353 
IPB001442B 12.38 6.84e-09 354-374 
PR01408B 9.21 9.84e-09 58-83


862 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 8.20e-22 222-247 
IPB000822 14.67 5.09e-21 306-331 
IPB000822 14.67 5.50e-20 474-499 
IPB000822 14.67 7.00e-20 446-471 
IPB000822 14.67 3.25e-19 390-415 
IPB000822 14.67 4.00e-19 194-219 
IPB000822 14.67 7.00e-19 278-303
IPB000822 14.67 4.46e-18 362-387
IPB000822 14.67 6.14e-17 250-275
IPB000822 14.67 3.40e-16 418-443
IPB000822 14.67 4.00e-16 334-359 


862 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 5.85e-14 415-428 
PR00048A 9.94 5.07e-13 219-232
PR00048A 9.94 3.12e-12 387-400 
PR00048A 9.94 4.71e-12 247-260 
PR00048A 9.94 4.71e-12 331-344 
PR00048B 5.52 7.00e-12 487-496


862 


IPB001275 


DM DNA binding domain 


IPB001275 19.17 7.04e-12 266-305
PR00048A 9.94 7.88e-12 499-512
PR00048A 9.94 1.95e-11 471-484
PR00048A 9.94 4.32e-11 443-456
PR00048B 5.52 5.50e-11 319-328
PR00048A 9.94 1.00e-10 191-204
IPB001275 19.17 1.36e-10 294-333
IPB001275 19.17 1.49e-10 350-389
PR00048A 9.94 5.09e-10 303-316 
IPB001275 19.17 5.14e-10 378-417 


862 


IPB002817 


ThiC family 


IPB002817H 11.39 5.42e-10 217-232
PR00048A 9.94 5.91e-10 359-372
IPB001275 19.17 8.18e-10 182-221
IPB001275 19.17 9.15e-10 322-361 
PR00048B 5.52 9.36e-10 375-384 
IPB001275 19.17 9.39e-10 210-249
IPB001275 19.17 9.39e-10 238-277
PR00048B 5.52 2.00e-09 207-216 
IPB000822 14.67 2.13e-09 502-527 
PR00048B 5.52 2.50e-09 459-468 
IPB001275 19.17 2.71e-09 462-501 
PR00048B 5.52 3.00e-09 403-412 
IPB001275 19.17 3.62e-09 406-445
PR00048A 9.94 4.38e-09 275-288 


862 


IPB000306 


"FYVE Zn-finger,
rabphilin/VPS27/FAB1 type"


IPB000306 8.96 4.71e-09 218-230 
PR00048B 5.52 5.50e-09 291-300 
IPB000306 8.96 5.76e-09 498-510 
IPB000306 8.96 6.03e-09 302-314 
PR00048B 5.52 7.00e-09 235-244 
IPB002817H 11.39 7.34e-09 301-316 
IPB001275 19.17 8.18e-09 434-473


862 


IPB002634 


BolA-like protein 


IPB002634A 23.30 8.62e-09 243-277 


864 


IPB000571 


Zinc finger C-x8-C-x5-C-x3-H type 


IPB000571 11.41 6.54e-10 66-76 


864 


PR01218 


Pistil-specific extensin-like signature 
II 


PR01218B 8.47 9.12e-09 140-163 


865 


PR00320 


G protein beta WD-40 repeat 
signature II 


PR00320B 12.82 5.68e-10 225-239 
PR00320A 13.15 7.48e-10 225-239


865 


IPB001680 


G-protein beta WD-40 repeats 


IPB001680 10.43 4.15e-09 227-238 
PR00320C 12.32 9.67e-09 225-239 


867 


IPB000954 


Aminotransferase class-III pyridoxal-
phosphate


IPB000954B 21.02 9.25e-25 291-330 
IPB000954A 20.25 7.12e-18 98-127 
IPB000954D 13.61 5.74e-17 377-395 
IPB000954C 12.88 9.44e-14 340-355 


868 


IPB000954 


Aminotransferase class-III pyridoxal-
phosphate


IPB000954B 21.02 9.25e-25 188-227 
IPB000954D 13.61 5.74e-17 274-292
IPB000954C 12.88 9.44e-14 237-252 


869 


IPB001254 


"Serine proteases, trypsin family" 


IPB001254C 16.54 2.50e-17 270-289 


869 


IPB000177 


Apple domain 


IPB000177O 14.39 1.11e-15 267-295
IPB001254A 9.98 6.14e-15 88-104


869 


PR00722 


Chymotrypsin serine protease family 
(S1) signature III


PR00722C 10.74 3.08e-14 236-248 
PR00722A 12.06 4.54e-14 89-104 
IPB001254B 15.01 7.14e-14 237-260 


869 


IPB000001 


Kringle 


IPB000001D 11.31 7.56e-12 88-104
IPB000001H 12.24 2.50e-11 239-249
IPB000177N 10.17 3.23e-11 229-263
IPB000177K 13.19 2.57e-10 90-122
PR00722B 12.69 6.85e-10 145-159 


873 


IPB001862 


Membrane attack complex 
components/perforin/complement C9 


IPB001862F 29.39 6.19e-15 343-390 


873 


PR00010 


Type II EGF-like signature I 


PR00010A 12.91 4.94e-13 46-57 


873 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 7.55e-13 541-556 
IPB001862F 29.39 8.07e-13 553-600 
IPB001862F 29.39 9.14e-13 515-562 
IPB001862F 29.39 3.07e-12 35-82 
IPB001862F 29.39 3.79e-12 73-120
IPB001862F 29.39 4.10e-12 304-351
IPB000152 8.86 6.04e-12 61-76 
IPB001862F 29.39 8.45e-12 477-524 
IPB001862F 29.39 8.45e-12 1031-1078
IPB000152 8.86 3.89e-11 137-152
IPB001862F 29.39 4.00e-11 153-200
IPB000152 8.86 4.86e-11 179-194



WO 2004/080148 



PCT/US2003/030720 



383 
TABLE 3B 









IPB001862F 29.39 6.70e-11 381-428
PR00010C 6.98 7.38e-11 374-384


873 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 7.63e-11 137-148
PR00010C 6.98 9.25e-11 66-76
IPB001862F 29.39 9.50e-11 265-312
PR00010A 12.91 1.00e-10 564-575
IPB000152 8.86 1.84e-10 369-384
PR00010A 12.91 2.38e-10 354-365
IPB001862F 29.39 2.63e-10 111-158
PR00010A 12.91 2.73e-10 488-499


873 


PR00764 


Complement C9 signature VI 


PR00764F 15.74 2.92e-10 170-190 


873 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat"


IPB000033B 7.05 3.03e-10 374-384 
PR00764F 15.74 3.16e-10 52-72 
PR00764F 15.74 3.52e-10 321-341 
PR00010C 6.98 3.90e-10 546-556 
IPB001881B 12.28 4.00e-10 541-552 
IPB000152 8.86 4.66e-10 503-518 
IPB001881A 8.72 4.86e-10 280-289 
PR00010A 12.91 5.50e-10 122-133 


873 


IPB002899 


EB module 


IPB002899B 11.81 5.59e-10 243-255
IPB000152 8.86 6.06e-10 407-422
IPB000033B 7.05 6.23e-10 296-306
IPB000152 8.86 6.63e-10 291-306
IPB001881A 8.72 7.43e-10 319-328
IPB001881A 8.72 7.43e-10 530-539
IPB001881A 8.72 8.07e-10 126-135
IPB001881B 12.28 8.29e-10 255-266
IPB000152 8.86 8.31e-10 23-38 
PR00764F 15.74 8.44e-10 360-380 
PR00764F 15.74 8.44e-10 570-590
IPB001881A 8.72 9.36e-10 168-177 
PR00764F 15.74 9.52e-10 398-418 
IPB000152 8.86 9.72e-10 255-270 
PR00010C 6.98 1.00e-09 296-306 
IPB001881A 8.72 2.20e-09 1046-1055


873 


PR00011 


Type III EGF-like signature II 


PR00011B 13.08 2.23e-09 63-81 
IPB001881B 12.28 2.57e-09 179-190 
IPB001881A 8.72 2.80e-09 358-367 


873


IPB003884 


Factor I membrane attack complex 


IPB003884C 13.00 2.83e-09 572-590 


873 


IPB000561 


EGF-like domain 


IPB000561 4.89 2.93e-09 626-634 
PR00010C 6.98 3.63e-09 28-38 
IPB000561 4.89 4.21e-09 378-386 


873 


IPB000359 


Cystine-knot domain 


IPB000359A 23.24 4.33e-09 70-94 
IPB000561 4.89 4.86e-09 108-116
IPB000359A 23.24 4.91e-09 108-132 
PR00010C 6.98 6.05e-09 184-194 
IPB001881A 8.72 6.40e-09 50-59 


873 


IPB000034 


Laminin B 


IPB000034C 12.97 6.49e-09 70-88 
PR00010A 12.91 7.27e-09 164-175 
PR00010A 12.91 7.27e-09 315-326 


873 


IPB001886 


Laminin N-terminal (Domain VI) 


IPB001886C 24.54 7.40e-09 300-339 
IPB000561 4.89 7.43e-09 223-231 
IPB000561 4.89 7.43e-09 550-558 
PR00011D 12.12 7.81e-09 371-389
IPB000152 8.86 8.11e-09 330-345
IPB000359A 23.24 8.24e-09 512-536


873 


IPB000006 


"Vertebrate metallothionein, family 
1" 


IPB000006 13.41 8.62e-09 75-120 
PR00010C 6.98 8.68e-09 412-422 
PR00764F 15.74 9.20e-09 282-302
IPB000033B 7.05 9.29e-09 546-556 
IPB001862F 29.39 9.36e-09 591-638 
IPB001881A 8.72 9.40e-09 568-577
PR00764F 15.74 9.43e-09 532-552 
PR00010A 12.91 9.45e-09 526-537 
PR00011D 12.12 9.74e-09 25-43
IPB000033B 7.05 1.00e-08 66-76 


874 


PR00960 


LmbP protein signature I 


PR00960A 10.63 4.67e-09 78-93 


875 


IPB000043 


S-adenosyl-L-homocysteine 
hydrolase 


IPB000043D 24.21 1.00e-40 235-289
IPB000043E 21.11 1.00e-40 298-350
IPB000043A 16.26 4.72e-33 119-156 
IPB000043H 17.16 1.72e-29 459-493 
IPB000043F 16.20 2.55e-24 351-377 
IPB000043G 18.51 3.25e-24 411-448
IPB000043B 18.62 5.95e-23 158-191 
IPB000043G 18.51 7.16e-15 412-449 
IPB000043C 8.96 9.61e-15 202-216 


878 


IPB002181 


Fibrinogen beta and gamma chains 
C-terminal globular domain 


IPB002181B 20.16 7.49e-24 181-217 
IPB002181D 29.18 7.32e-15 243-283 
IPB002181C 15.87 2.64e-10 222-234 


879 


IPB002181 


Fibrinogen beta and gamma chains 
C-terminal globular domain 


IPB002181B 20.16 7.49e-24 181-217 
IPB002181D 29.18 7.32e-15 243-283
IPB002181C 15.87 2.64e-10 222-234 


880 


IPB002181 


Fibrinogen beta and gamma chains 
C-terminal globular domain 


IPB002181B 20.16 7.49e-24 181-217 
IPB002181D 29.18 7.32e-15 243-283 
IPB002181C 15.87 2.64e-10 222-234 


QQ1 


IPB002027


Amino acid permease 


IPB002027D 22.00 4.13e-25 325-364
IPB002027C 19.67 2.74e-22 244-282 
IPB002027A 18.88 3.77e-16 47-75 
IPB002027B 12.67 7.97e-12 180-199 


884


IPB001772


Kinase associated domain 1 


IPB001772E 24.88 4.03e-10 620-659 


884


IPB000861


PKN/rhophilin/rhotekin rho-binding 
repeat 


IPB000861D 13.61 7.34e-10 97-133 


884 


IPB000961 


Protein kinase C-terminal domain 


IPB000961A 16.82 8.45e-09 99-133 


884 


IPB003527 


MAP kinase 


IPB003527D 21.53 9.15e-09 462-503 


885 


IPB001304 


C-type lectin domain 


IPB001304A 17.98 8.04e-14 34-58 


891 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 1.72e-10 103-140 


891 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 1.31e-09 155-169 
PR00049D 0.00 6.80e-09 156-170 


892 


PR00503 


Bromodomain signature IV 


PR00503D 19.24 3.57e-21 421-440 


892 


IPB001487 


Bromodomain 


IPB001487B 17.44 2.13e-19 412-433
PR00503B 10.44 4.37e-19 94-110
IPB001487A 11.44 5.20e-19 95-113
PR00503C 19.09 4.00e-17 110-128
IPB001487A 11.44 9.53e-16 388-406 
PR00503A 14.57 4.00e-14 78-91 
PR00503B 10.44 8.64e-14 387-403 


892 


IPB001359 


Synapsin 


IPB001359H 22.58 1.65e-13 752-802 
PR00503D 19.24 9.25e-13 128-147 
IPB001487B 17.44 1.58e-12 119-140
PR00503C 19.09 6.70e-11 403-421


892 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 8.87e-11 755-769
PR00049D 0.00 9.47e-11 756-770
IPB001359H 22.58 9.70e-11 979-1029


892 


PR00209 


Alpha/beta gliadin family signature II 


PR00209B 4.73 4.80e-10 966-984
892


IPB003861


E4 protein 


IPB003861B 9.06 4.86e-10 979-993 


892 


IPB001505


"Cu(A) centre of cytochrome c
oxidase, subunit II and nitrous oxide 
reductase" 


IPB001505B 15.93 5.94e-10 406-455
PR00209B 4.73 6.90e-10 968-986 
IPB001359H 22.58 7.40e-10 753-803


892 


PR01471 


Histamine H3 receptor signature V 


PR01471E 5.41 7.44e-10 765-780 
IPB001359H 22.58 7.77e-10 962-1012 
PR00209B 4.73 9.80e-10 752-770 
IPB001505A 18.04 1.17e-09 93-140 
PR00049D 0.00 2.22e-09 748-762 
IPB003861B 9.06 3.15e-09 763-777
PR00049D 0.00 3.29e-09 972-986 
IPB001359H 22.58 3.88e-09 757-807 
PR01471E 5.41 4.03e-09 981-996 
PR01471E 5.41 4.23e-09 1019-1034
IPB003861B 9.06 4.52e-09 754-768 


892 


IPB003351 


Dishevelled specific domain 


IPB003351C 13.82 5.13e-09 485-524 
IPB001359H 22.58 5.19e-09 941-991 
PR01471E 5.41 5.99e-09 755-770 
PR00503A 14.57 6.81e-09 371-384 
IPB001359H 22.58 7.03e-09 765-815 
IPB001359H 22.58 7.03e-09 970-1020 


892 


PR01217 


Proline rich extensin signature IV 


PR01217D 4.57 7.49e-09 239-260 


892 


PR01503 


Treacher Collins syndrome protein 
Treacle signature II 


PR01503B 3.77 7.64e-09 702-715 


892 


IPB000574 


Tymovirus coat protein 


IPB000574A 32.18 7.78e-09 254-301 


892 


PR00910 


Luteovirus ORF6 protein signature I 


PR00910A 2.74 8.07e-09 255-267 
IPB001359H 22.58 8.25e-09 978-1028 
IPB001359H 22.58 8.51e-09 193-243 
IPB001359H 22.58 8.51e-09 745-795 
IPB001359H 22.58 9.04e-09 754-804 


892 


IPB001978 


Troponin 


IPB001978B 22.99 9.15e-09 530-561 
PR00209B 4.73 9.90e-09 758-776 


893 


IPB003112 


Olfactomedin-like domain


IPB003112C 13.54 4.69e-33 343-383 
IPB003112E 16.12 5.24e-33 416-458
IPB003112B 14.91 6.65e-27 269-320 
IPB003112D 17.44 9.58e-23 384-410 
IPB003112A 14.44 2.97e-13 230-245 


893 


PR01444


Latrophilin receptor signature V 


PR01444E 11.17 7.70e-12 346-361 


893 


PR00952 


Type III secretion system inner 
membrane Q protein family signature 
III 


PR00952C 21.25 2.04e-09 7-29


893 


IPB002862


Protein of unknown function DUF16 


IPB002862C 11.30 9.59e-09 80-102 


894 


IPB002350 


Kazal-type serine protease inhibitor 
family 


IPB002350 31.78 4.12e-21 92-132 


894 


PR00290 


Kazal-type serine protease inhibitor 
signature I 


PR00290A 13.80 3.61e-12 92-102 


894 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain


IPB003006B 20.23 1.36e-10 390-427 


894 


PR00450 


Recoverin family signature III 


PR00450C 11.99 5.04e-09 182-203


895 


IPB001511 


Aminotransferases class-I 


IPB001511B 11.54 3.14e-11 177-191


895 


PR00753 


1-aminocyclopropane-1-carboxylate
synthase signature V 


PR00753E 10.09 9.22e-11 171-195
IPB001511C 12.45 9.07e-10 243-256 


896 


IPB001781 


LIM domain 


IPB001781 11.42 3.37e-12 102-112 
IPB001781 11.42 2.04e-10 173-183 
IPB001781 11.42 4.60e-09 43-53
IPB001781 11.42 7.90e-09 231-241


896 


IPB003452


Stem cell factor 


IPB003452C 13.68 9.29e-09 525-558 
897


IPB000961


Protein kinase C-terminal domain 


IPB000961D 21.23 5.29e-29 512-553 


897 


IPB001772 


Kinase associated domain 1 


IPB001772B 18.27 4.79e-24 409-454 


897 


IPB001245


Tyrosine kinase catalytic domain 


IPB001245B 21.68 2.80e-19 516-554 


897 


IPB000861 


PKN/rhophilin/rhotekin rho-binding 
repeat 


IPB000861G 13.73 9.60e-16 518-567 


897 


IPB000959 


POLO box duplicated region 


IPB000959C 23.49 8.03e-15 491-543 


897 


IPB003527 


MAP kinase 


IPB003527G 17.26 8.94e-15 586-623 
IPB001772E 24.88 2.25e-14 574-613
IPB000961B 17.79 2.37e-14 412-443 
IPB001245A 22.45 6.88e-14 460-500 
IPB000961A 16.82 7.75e-14 355-389 
IPB001772C 20.66 9.62e-14 455-485 
IPB001772D 21.67 4.73e-13 523-562 
IPB000959B 15.68 3.18e-11 444-484
IPB001772A 13.64 5.22e-11 353-384
IPB003527D 21.53 6.02e-11 509-550
IPB000861E 16.40 9.36e-11 399-444


897


IPB000095 


PAK-box/P21-Rho-binding


IPB000095F 16.47 9.65e-10 520-574
IPB003527C 14.70 2.54e-09 452-500 
IPB000861D 13.61 2.99e-09 353-389 
IPB000961C 15.48 3.45e-09 467-501


897 


PR00109 


Tyrosine kinase catalytic domain 
signature II 


PR00109B 11.07 3.81e-09 467-485
IPB000959D 27.01 5.31e-09 567-619 
IPB000959A 7.12 7.62e-09 356-368 


898 


IPB000961 


Protein kinase C-terminal domain 


IPB000961D 21.23 5.29e-29 699-740 


898 


IPB001772 


Kinase associated domain 1 


IPB001772B 18.27 4.79e-24 596-641 


898 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245B 21.68 2.80e-19 703-741 


898 


IPB000861 


PKN/rhophilin/rhotekin rho-binding 
repeat 


IPB000861G 13.73 9.60e-16 705-754 


898 


IPB000959 


POLO box duplicated region 


IPB000959C 23.49 8.03e-15 678-730 


898 


IPB003527 


MAP kinase 


IPB003527G 17.26 8.94e-15 773-810 
IPB001772E 24.88 2.25e-14 761-800 
IPB000961B 17.79 2.37e-14 599-630 
IPB001245A 22.45 6.88e-14 647-687 
IPB000961A 16.82 7.75e-14 542-576
IPB001772C 20.66 9.62e-14 642-672 
IPB001772D 21.67 4.73e-13 710-749 


898 


IPB003533 


Doublecortin 


IPB003533F 11.80 5.30e-12 161-194
IPB000959B 15.68 3.18e-11 631-671
IPB001772A 13.64 5.22e-11 540-571
IPB003527D 21.53 6.02e-11 696-737
IPB000861E 16.40 9.36e-11 586-631


898 


IPB000095 


PAK-box/P21-Rho-binding


IPB000095F 16.47 9.65e-10 707-761 
IPB003527C 14.70 2.54e-09 639-687 
IPB000861D 13.61 2.99e-09 540-576 
IPB000961C 15.48 3.45e-09 654-688 


898 


PR00109 


Tyrosine kinase catalytic domain 
signature II


PR00109B 11.07 3.81e-09 654-672 
IPB000959D 27.01 5.31e-09 754-806 
IPB000959A 7.12 7.62e-09 543-555 
IPB003533E 7.28 8.25e-09 105-144 


900 


IPB001073 


Complement C1q protein


IPB001073B 20.88 6.00e-26 131-165 
IPB001073A 22.14 4.48e-20 85-119 


900 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 9.63e-20 54-107 


900 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 4.27e-19 55-107 
IPB000885B 19.15 7.48e-19 60-113 
IPB000885A 11.46 1.97e-18 62-99
IPB000885A 11.46 2.94e-18 68-105 
900


PR00007 


Complement C1Q domain signature
III 


PR00007C 16.13 3.67e-18 199-220 
IPB001442A 26.12 1.11e-17 64-116
PR00007A 20.64 1.84e-17 124-150 
IPB001442A 26.12 1.87e-17 70-122
IPB000885B 19.15 5.39e-17 57-110 
IPB000885A 11.46 6.96e-17 65-102 
IPB000885B 19.15 8.87e-17 51- 


900 


IPB000817 


Prion protein 


IPB000817A 8.34 3.27e-09 51-93 
IPB000885A 11.46 3.66e-09 19-56 
IPB001442A 26.12 4.13e-09 12-64 
IPB000885B 19.15 4.19e-09 26-79 
IPB000885A 11.46 4.77e-09 86-123 
IPB001442A 26.12 4.83e-09 24-76 
IPB001442B 12.38 5.99e-09 37-57 
IPB001442A 26.12 6.17e-09 21-73 
IPB000885B 19.15 7.55e-09 36-89 
IPB001442B 12.38 7.57e-09 71-91 
IPB001442A 26.12 8.36e-09 9-61 
IPB001442B 12.38 8.54e-09 89-109 
IPB001073A 22.14 8.59e-09 30-64 
IPB000885B 19.15 8.69e-09 78-131 
IPB001442B 12.38 9.64e-09 74-94 


901 


IPB000074 


Apolipoprotein A1/A4/E 


IPB000074A 11.45 9.84e-09 7-24


902 


IPB002360 


Involucrin 


IPB002360C 15.36 3.06e-14 407-448 


902 


PR00209 


Alpha/beta gliadin family signature II 


PR00209B 4.73 5.94e-12 427-445 


902 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 8.67e-11 183-207
IPB000135D 2.13 2.96e-10 184-208


902 


IPB001580 


Calreticulin family 


IPB001580F 2.93 4.94e-10 189-198 
IPB001580F 2.93 4.94e-10 190-199 
IPB001580F 2.93 4.94e-10 191-200 
IPB002360C 15.36 5.93e-10 416-457 
IPB000135D 2.13 7.46e-10 186-210 
IPB000135D 2.13 7.46e-10 187-211
IPB000135D 2.13 9.22e-10 185-209 
IPB002360C 15.36 2.50e-09 396-437 
IPB002360C 15.36 2.50e-09 415-456 
IPB000135D 2.13 3.55e-09 182-206 
IPB000135D 2.13 4.27e-09 188-212 
IPB000135D 2.13 4.91e-09 181-205 


902 


IPB001359 


Synapsin 


IPB001359H 22.58 5.19e-09 421-471
IPB002360C 15.36 5.20e-09 404-445 




902


IPB001422


Neuromodulin (GAP-43) 


IPB001422C 16.82 5.61e-09 184-219 
IPB002360C 15.36 5.70e-09 413-454 
IPB002360C 15.36 6.10e-09 389-430 


902 


IPB003753 


"Exonuclease VII, large subunit" 


IPB003753F 28.29 7.54e-09 382-432 
IPB002360C 15.36 8.80e-09 419-460 


905


IPB000483


Leucine rich repeat C-terminal
domain


IPB000483 11.18 5.50e-13 37-51


905


PR00019


Leucine-rich repeat signature I


PR00019A 11.72 9.33e-10 5-18


906


IPB003006


Immunoglobulin and major
histocompatibility complex domain


IPB003006B 20.23 8.83e-11 55-92


908 


PR00457 


Animal haem peroxidase signature V 


PR00457E 19.97 8.45e-24 1041-1067 
PR00457D 18.35 1.53e-20 1016-1036 
PR00457C 18.81 9.42e-15 998-1016 
PR00457G 14.17 4.48e-14 1221-1241 
PR00457H 14.82 5.85e-13 1292-1306 
PR00457F 14.42 6.32e-12 1094-1104 
908


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 1.00e-10 156-170 
PR00457B 12.43 2.29e-10 846-861 


908 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 2.80e-10 352-389 
IPB003006B 20.23 8.92e-10 448-485 
IPB003006B 20.23 9.28e-10 259-296 


909 


PR00457 


Animal haem peroxidase signature V 


PR00457E 19.97 8.45e-24 1072-1098
PR00457D 18.35 1.53e-20 1047-1067
PR00457C 18.81 9.42e-15 1029-1047 
PR00457G 14.17 4.48e-14 1252-1272 
PR00457H 14.82 5.85e-13 1323-1337 
PR00457F 14.42 6.32e-12 1125-1135 


909 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 1.00e-10 187-201 
PR00457B 12.43 2.29e-10 877-892 


909 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 2.80e-10 383-420 
IPB003006B 20.23 8.92e-10 479-516 
IPB003006B 20.23 9.28e-10 290-327 


910 


PR00457 


Animal haem peroxidase signature V 


PR00457E 19.97 8.45e-24 934-960 
PR00457D 18.35 1.53e-20 909-929 
PR00457C 18.81 9.42e-15 891-909 
PR00457G 14.17 4.48e-14 1114-1134 
PR00457H 14.82 5.85e-13 1185-1199 
PR00457F 14.42 6.32e-12 987-997 
PR00457B 12.43 2.29e-10 739-754 


910


IPB003006


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 2.80e-10 329-366 
IPB003006B 20.23 8.92e-10 425-462 
IPB003006B 20.23 9.28e-10 236-273 


910 


PR00019 


Leucine-rich repeat signature II 


PR00019B 11.42 6.73e-09 73-86


911


PR00010


Type II EGF-like signature I


PR00010A 12.91 7.75e-13 43-54 


911 


IPB001862 


Membrane attack complex 
components/perforin/complement C9 


IPB001862F 29.39 5.45e-12 925-972 
IPB001862F 29.39 7.21e-12 559-606 


911 


IPB000152


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 7.48e-12 117-132 
PR00010A 12.91 1.00e-11 102-113
PR00010A 12.91 4.27e-11 168-179


911 


IPB000561 


EGF-like domain 


IPB000561 4.89 8.00e-11 88-96
IPB001862F 29.39 8.70e-11 846-893


911 


IPB000359 


Cystine-knot domain


IPB000359A 23.24 2.71e-10 843-867
PR00010C 6.98 3.61e-10 122-132


911 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881A 8.72 4.21e-10 574-583
IPB000152 8.86 4.66e-10 183-198
IPB000561 4.89 4.75e-10 126-134 
IPB001881A 8.72 6.79e-10 940-949 
PR00010C 6.98 7.10e-10 877-887 
PR00010C 6.98 7.68e-10 230-240
PR00010C 6.98 1.22e-09 590-600 


911 


PR00764 


Complement C9 signature VI 


PR00764F 15.74 1.34e-09 942-962 
IPB001881B 12.28 1.78e-09 183-194 
PR00764F 15.74 3.28e-09 576-596 
IPB000359A 23.24 3.35e-09 88-112 
IPB000561 4.89 4.21e-09 881-889 
IPB000152 8.86 4.79e-09 834-849 
IPB001862F 29.39 6.23e-09 91-138 
IPB001881A 8.72 7.00e-09 106-115 
IPB001881B 12.28 7.65e-09 117-128 
PR00010C 6.98 7.80e-09 84-94 


911 


IPB000033 


"Low-density lipoprotein (ldl)
receptor, YWTD repeat" 


IPB000033B 7.05 8.34e-09 877-887 
IPB000152 8.86 8.58e-09 872-887 


911 


IPB003884 


Factor I membrane attack complex 


IPB003884F 16.26 8.71e-09 177-192
7lJ


IPB002151

Kinesin light chain repeat 


IPB002151B 14.23 8.01e-10 240-292


913 


IPB000421 


Coagulation factor 5/8 type C 
domain (FA58C) 


IPB000421A 21.21 7.85e-09 43-62 


913 


IPB002360 


Involucrin 


IPB002360C 15.36 8.00e-09 373-414 


914 


IPB003117 


Regulatory subunit of type II PKA R- 
subunit 


IPB003117C 17.01 1.00e-40 147-187 
IPB003117D 18.87 1.00e-40 198-238 
IPB003117G 17.45 8.50e-33 341-375 
IPB003117A 22.23 5.50e-26 24-56
IPB003117E 18.84 5.85e-23 287-315 


914 


IPB000595 


Cyclic nucleotide-binding domain 


IPB000595C 23.31 6.82e-21 321-346 


914 


PR00103 


cAMP-dependent protein kinase 
signature II 


PR00103B 10.32 7.00e-18 173-187 
IPB000595B 15.72 7.50e-18 279-302 
IPB003117F 17.26 1.00e-17 323-337
IPB000595B 15.72 4.43e-16 161-184 
PR00103A 9.07 7.75e-16 158-172 
IPB003117C 17.01 2.96e-15 265-305 
IPB003117D 18.87 4.14e-15 322-362
PR00103E 12.91 5.91e-14 355-367
PR00103D 10.18 2.93e-13 334-345 
IPB000595C 23.31 4.60e-13 197-222 
PR00103C 13.28 1.84e-11 322-331
PR00103D 10.18 2.98e-10 210-221 
IPB003117E 18.84 3.57e-10 157-185 
IPB003117E 18.84 5.43e-10 275-303 
IPB003117F 17.26 1.50e-09 199-213 
PR00103A 9.07 8.11e-09 276-290


915 


IPB001478 


PDZ domain (also known as DHR or 
GLGF) 


IPB001478B 6.12 4.94e-09 602-611


916 


IPB000907 


Lipoxygenase 


IPB000907J 20.31 5.50e-37 521-563 
IPB000907G 22.23 1.87e-34 371-413 
IPB000907F 21.29 1.00e-28 338-370 
IPB000907I 27.52 9.79e-28 460-513


916 


PR00467 


Mammalian lipoxygenase signature 
VI 


PR00467F 12.25 9.41e-22 418-440 


916 


PR00087 


Lipoxygenase signature III 


PR00087C 13.32 1.39e-21 373-393 
IPB000907C 16.09 7.17e-21 221-247 
IPB000907E 15.16 1.00e-18 296-320
PR00467E 9.17 2.10e-17 293-312 
PR00467D 17.16 9.57e-17 196-217 
IPB000907D 18.70 2.67e-16 262-289 
PR00087A 20.06 3.52e-15 335-352 
PR00087B 13.69 5.11e-15 353-370
IPB000907B 14.10 2.50e-13 160-175 
PR00467A 8.38 3.29e-13 11-28 
IPB000907H 18.37 5.86e-13 434-450 
PR00467B 14.98 5.88e-12 57-76 
PR00467G 16.61 3.37e-11 576-593
IPB000907A 16.20 4.21e-10 94-103
PR00467C 9.34 7.65e-10 134-148


917 


IPB000907 


Lipoxygenase 


IPB000907C 16.09 7.17e-21 194-220
IPB000907E 15.16 1.00e-18 269-293 


917 


PR00467 


Mammalian lipoxygenase signature 
V 


PR00467E 9.17 2.10e-17 266-285 
PR00467D 17.16 9.57e-17 169-190 
IPB000907D 18.70 2.67e-16 235-262 
IPB000907B 14.10 2.50e-13 131-146 
PR00467A 8.38 3.29e-13 11-28 
PR00467B 14.98 5.88e-12 57-76 
IPB000907A 16.20 4.21e-10 94-103 
918


IPB000907


Lipoxygenase 


IPB000907C 16.09 7.17e-21 223-249
IPB000907E 15.16 1.00e-18 298-322 


918 


PR00467


Mammalian lipoxygenase signature 
V 


PR00467E 9.17 2.10e-17 295-314 
PR00467D 17.16 9.57e-17 198-219 
IPB000907D 18.70 2.67e-16 264-291 
IPB000907B 14.10 2.50e-13 160-175
PR00467A 8.38 3.29e-13 11-28 
PR00467B 14.98 5.88e-12 57-76 
IPB000907A 16.20 4.21e-10 94-103 
PR00467C 9.34 7.65e-10 134-148 


927


IPB001774


Delta serrate ligand


IPB001774C 18.25 1.71e-31 37-79 
IPB001774D 19.23 3.32e-25 83-129 


927 


PR00011 


Type III EGF-like signature IV 


PR00011D 12.12 4.57e-12 39-57 


927


IPB000152


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 1.00e-10 189-204
IPB001774C 18.25 2.15e-10 68-110 


927 


PR00010 


Type II EGF-like signature III 


PR00010C 6.98 3.90e-10 113-123 


927


IPB000359

Cystine-knot domain 


IPB000359A 23.24 4.86e-10 160-184 


927 


IPB000034 


Laminin B 


IPB000034C 12.97 6.42e-10 236-254
PR00011B 13.08 7.88e-10 39-57


927 


IPB000561 


EGF-like domain 


IPB000561 4.89 9.25e-10 46-54 


927 


IPB001886 


Laminin N-terminal (Domain VI)


IPB001886E 10.90 9.67e-10 44-60 
PR00010A 12.91 1.27e-09 174-185 
PR00010C 6.98 2.54e-09 194-204 


927 


IPB001862 


Membrane attack complex 
components/perforin/complement C9 


IPB001862F 29.39 2.65e-09 201-248 
IPB000152 8.86 6.21e-09 108-123 
PR00011A 14.05 6.88e-09 39-57


927


PR01217


Proline rich extensin signature VII


PR01217G 4.02 7.79e-09 252-277 
IPB001862F 29.39 8.53e-09 163-210
IPB000034A 22.21 9.00e-09 96-131 
IPB000152 8.86 9.29e-09 227-242 


927 


IPB001762 


Disintegrin 


IPB001762A 23.93 9.65e-09 126-166 


928


PR00456

Ribosomal protein P2 signature V 


PR00456E 3.08 7.80e-09 1-15 


930 


IPB001248 


"Permeases for cytosine/purines, 
uracil, thiamine, allantoin" 


IPB001248A 28.27 5.94e-10 238-273 


930 


IPB000390 


"Integral membrane protein, DUF7" 


IPB000390B 26.91 6.96e-10 217-271


931 


IPB001359 


Synapsin 


IPB001359H 22.58 9.63e-10 47-97


932 


PR00336 


Lysosome-associated membrane 
glycoprotein signature IV 


PR00336D 10.26 5.99e-09 2-24 


933 


IPB002467 


"Methionine aminopeptidase, 
subfamily 1" 


IPB002467C 17.56 2.29e-30 169-197 
IPB002467B 12.68 2.50e-23 143-164 
IPB002467F 18.38 1.71e-21 299-329


933 


PR00599 


Methionine aminopeptidase-1 
signature II 


PR00599B 10.21 8.00e-17 173-189 
IPB002467D 14.78 5.50e-15 242-267 
PR00599A 11.84 9.63e-14 151-164
IPB002467E 11.05 7.75e-12 275-287 
PR00599D 14.43 5.03e-10 273-285 
IPB002467A 15.75 2.87e-09 115-132 


933 


IPB001131 


Proline dipeptidase


TPROA1 1 1 1Fi 11 ^ 1 Q*» AO 7*7C ooo 

iFovA/iijiu i i.jo j. i©e-uy z/ j-Zoo 
IPB001131B 18.96 8.10e-09 173-194 


934 


IPB001463 


Sodium:alanine symporter family


IPB001463A 16.70 5.87e-09 174-224 


938 


IPB001478


PDZ domain (also known as DHR or 
GLGF) 


IPB001478A 11.55 5.09e-09 119-129 
IPB001478B 6.12 1.00e-08 137-146 


940 


PR01286


Orphan nuclear receptor NOR1 
signature V 


PR01286E 5.27 9.26e-09 307-328 


941 


IPB000998 


MAM domain 


IPB000998D 18.66 1.96e-15 527-550 


941 


IPB003886 


Extracellular domain in nidogen


IPB003886D 13.91 8.77e-15 237-256 


941 


IPB000152 


Aspartic acid and asparagine
hydroxylation site


IPB000152 8.86 2.89e-14 110-125


941


IPB001881


Calcium-binding EGF-like domain


IPB001881B 12.28 5.00e-14 192-203
IPB000152 8.86 1.00e-13 237-252
IPB000152 8.86 1.82e-13 192-207

IPB001881B 12.28 4.70e-13 110-121


941 


IPB001774 


Delta serrate ligand 


IPB001774C 18.25 9.13e-13 72-114
IPB000998B 17.20 1.00e-12 410-422


941 


PR00020 


MAM domain signature I 


PR00020A 20 45? 9 rRp i 1 40r ai& 
IPB000998C 18 S lOe-1 1 464-47Q 
IPB001881B 12.28 8.58e-11 237-248


941 


PR00907 


Thrombomodulin signature II 


PR00907B 1 1 50 ? ddr-i n i aa.i m 


941 


IPB000561 


EGF-like domain 


IPB000561 4.89 3.25e-10 81-89 


941 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat" 


IPB000033B 7.05 5.35e-10 242-252
IPB000033B 7.05 5.97e-09 197-207 


941 


IPB000167 


Dehydrin 


IPB000167A 8.58 7.14e-09 324-351


941 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367A 11.78 9.79e-09 159-179 


942 


IPB000998 


MAM domain 


IPB000998D 18.66 1.96e-15 532-555


942 


IPB003886


Extracellular domain in nidogen 


IPB003886D 13.91 8.77e-15 242-261 


942 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 2.89e-14 115-130


942 


IPB001881


Calcium-binding EGF-like domain


IPB001881B 12.28 5.00e-14 197-208
IPB000152 8.86 1.00e-13 242-257
IPB000152 8.86 1.82e-13 197-212
IPB001881B 12.28 4.70e-13 115-126


942 


IPB001774


Delta serrate ligand 


IPB001774C 18.25 9.13e-13 77-119 
IPB000998B 17.20 1.00e-12 415-427


942 


PR00020 


MAM domain signature I 


PR00090A 90 4R 9 RR/» 1 1 A11 /111 
IPBOOOQQRP 1 R fil S 1 1 A&Q AQA 

IPB001881B 12.28 8.58e-11 242-253


942 


PR00907 


Thrombomodulin signature II 


PR00907B 1 1 50 *> 44e- 1 0 1 40- 1 6* ~1 


942 


IPB000561 


EGF-like domain 


IPB000561 4.89 3.25e-10 86-94


942 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat" 


IPB000033B 7.05 5.35e-10 247-257


942 


PR01256


Otx1 transcription factor signature II


PR01256B 5.92 2.01e-09 23-35 
IPB000033B 7.05 5.97e-09 202-212
PR01256B 5.92 6.46e-09 24-36 


942 


IPB000167 


Dehydrin 


IPB000167A 8.58 7.14e-09 329-356 


942 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367A 11.78 9.79e-09 164-184


943 


IPB002893 


MYND zinc finger (ZnF) domain 


IPB002893 16.28 4.52e-17 986-1004


943 


IPB000313 


PWWP domain 


IPB000313A 8.15 6.88e-15 276-290 


943 


IPB001487 


Bromodomain 


IPB001487B 17.44 1.32e-13 202-223
IPB001487A 11.44 9.33e-12 178-196 


943


IPB002219 


Phorbol esters/diacylglycerol binding 
domain 


IPB002219B 12.53 5.14e-10 94-109


943 


PR00503 


Bromodomain signature II 


PR00503B 10.44 7.38e-09 177-193


943 


IPB002889 


WSC domain 


IPB002889C 9.89 8.12e-09 762-783
IPB002889B 11.76 9.91e-09 744-790


944 


IPB000313 


PWWP domain 


IPB000313A 8.15 6.88e-15 276-290 


944 


IPB001487 


Bromodomain 


IPB001487B 17.44 1.32e-13 202-223 
IPB001487A 11.44 9.33e-12 178-196 


944


IPB002219 


Phorbol esters/diacylglycerol binding 
domain 


IPB002219B 12.53 5.14e-10 94-109 


944 


PR00503 


Bromodomain signature II 


PR00503B 10.44 7.38e-09 177-193 


945 


IPB002893 


MYND zinc finger (ZnF) domain 


IPB002893 16.28 4.52e-17 1032-1050 


945 


IPB000313 


PWWP domain 


IPB000313A 8.15 6.88e-15 276-290 


945 


IPB001487 


Bromodomain 


IPB001487B 17.44 1.32e-13 202-223 
IPB001487A 11.44 9.33e-12 178-196 
945


IPB002219 


Phorbol esters/diacylglycerol binding
domain


IPB002219B 12.53 5.14e-10 94-109 


945 


PR00503


Bromodomain signature II


PR00503B 10.44 7.38e-09 177-193


945 


IPB002889 


WSC domain 


IPB002889C 9.89 8.12e-09 762-783 
IPB002889B 11.76 9.91e-09 744-790


946


IPB002893


MYND zinc finger (ZnF) domain


IPB002893 16.28 4.52e-17 1037-1055


946 


IPB000313 


PWWP domain 


IPB000313A 8.15 6.88e-15 281-295 


946


IPB001487


Bromodomain 


IPB001487B 17.44 1.32e-13 207-228
IPB001487A 11.44 9.33e-12 183-201 


946


IPB002219


Phorbol esters/diacylglycerol binding
domain


IPB002219B 12.53 5.14e-10 99-114 


946 


PR00503 


Bromodomain signature II 


PR00503B 10.44 7.38e-09 182-198 


946 


IPB002889 


WSC domain 


IPB002889C 9.89 8.12e-09 767-788
IPB002889B 11.76 9.91e-09 749-795 


950 


PR00169 


Potassium channel signature VII 


PR00169G 11.30 5.96e-11 467-489


950 


PR01333


Two pore domain K+ channel
signature I


PR01333A 18.74 7.08e-10 479-507 
PR01333B 10.39 5.95e-09 482-491




PR00206


Connexin signature VI 


PR00206F 15.67 6.01e-09 498-521 


951 


IPB001762 


Disintegrin 


IPB001762A 23.93 4.33e-23 441-481 


951 


IPB002870 


Reprolysin family propeptide 


IPB002870B 24.73 3.54e-20 114-152 


951 


PR00289 


Disintegrin signature I 


PR00289A 14.29 1.16e-14 457-476
IPB002870F 18.81 3.03e-14 385-409
IPB002870E 11.90 2.46e-12 344-356
IPB001762B 10.06 3.40e-12 488-498 
IPB001762A 23.93 9.20e-11 409-449


951 


IPB000130 


"Neutral zinc metallopeptidases, 
zinc-binding region" 


IPB000130 5.86 1.56e-10 342-352 


951


PR00138


Matrixin signature IV 


PR00138D 14.57 2.54e-10 342-367 
IPB002870D 16.31 4.77e-10 310-325 


951 


PR00480 


Astacin family signature II 


PR00480B 14.35 5.57e-10 337-355 


951 


PR00436 


Interleukin-8 signature I 


PR00436A 15.20 7.43e-10 5-28 


951 


IPB001818 


Matrixin 


IPB001818D 14.91 1.72e-09 336-367 
PR00289B 11.74 3.80e-09 486-498
IPB002870A 12.22 6.54e-09 68-84 


951 


PR01236


Tumour necrosis factor beta 
(lymphotoxin-alpha) signature I 


PR01236A 4.92 7.49e-09 17-33 
IPB002870C 11.01 9.64e-09 278-288 


953 


IPB000906 


ZU5 domain 


IPB000906E 22.11 5.55e-11 248-288


953 


PR01415 


Ankyrin repeat signature I 


PR01415A 12.73 6.46e-11 251-263
IPB000906D 23.89 6.59e-11 316-370
PR01415A 12.73 7.11e-11 184-196
PR01415A 12.73 7.43e-11 152-164
IPB000906F 35.93 5.85e-10 194-247
PR01415B 10.23 5.88e-09 263-275
IPB000906G 25.85 6.69e-09 330-378 


953 


PR00898 


Vasopressin V2 receptor signature II 


PR00898B 4.91 7.69e-09 46-60 
IPB000906A 22.49 7.84e-09 177-219 


954 


IPB000471 


"Interferon alpha, beta and delta 
family"


IPB000471A 27.36 3.61e-32 45-98 


954 


PR00266 


Interferon alpha and beta subunit 
signature I 


PR00266A 13.41 9.59e-14 67-79 


955 


PR01136 


Gap junction alpha-6 protein (Cx45) 
signature I 


PR01136A 6.68 5.05e-09 203-209


956 


PR00081


Glucose/ribitol dehydrogenase family 
signature VI 


PR00081F 13.94 5.50e-13 152-172 
PR00081A 10.07 5.67e-13 34-51 
PR00081B 8.91 5.66e-11 108-119


956 


PR01397 


"2,3-dihydro-2,3-dihydroxybenzoate 
dehydrogenase signature VI"


PR01397F 12.91 9.53e-11 168-187
956


PR00080 


Short-chain dehydrogenase/reductase 
(SDR) superfamily signature I


PR00080A 7.98 3.73e-09 108-119 
PRO 1 3Q7 A 1 1 11 A no io c< 


958 


IPB000560 


Histidine acid phosphatase 


IPB000560 17.02 7.55e-13 30-52 


958 


PR00885 


Bacterial general secretion pathway 
protein H signature II 


PR00885B 8.16 9.14e-10 394-408 


958 


PR01319 


Glial cell line-derived neurotrophic

factor receptor alpha 3 signature I 


rtwji jiyA j.iSD j.yJe-Oy 10-22 


959 


IPB000215 


Serpins 


IPB000215D 15.35 7.00e-22 224-250 
IPB000215E 15.36 6.06e-18 305-329 
IPB000215C 13.90 4.75e-17 122-136 

IPB000215B 9.87 3.84e-12 95-107


960 


IPB000215 


Serpins 


IPB000215D 15.35 7.00e-22 292-318 
IPB000215A 13.01 4.18e-20 73-96 
IPB000215E 15.36 6.06e-18 373-397 

IPB000215C 13.90 5.82e-11 190-204


961 


IPB000215 


Serpins


IPB000215D 15.35 7.00e-22 292-318
IPB000215A 13.01 4.18e-20 73-96
IPB000215E 15.36 6.06e-18 373-397
IPB000215C 13.90 4.75e-17 190-204
IPB000215B 9.87 3.84e-12 163-175 


962 


IPB000215


Serpins


IPB000215A 13.01 4.18e-20 73-96

IPB000215E 15.36 6.06e-18 373-397 
IPB000215C 13.90 4.75e-17 208-222 
IPB000215B 9.87 3.84e-12 181-193 


964 


IPB001762 


Disintegrin 


IPB001762A 23.93 4.33e-23 457-497 


964 


IPB002870 


Reprolysin family propeptide 


IPB002870F 18.81 2.35e-19 402-426

TPRAAOCTAC 1 1 on o 1H~. i c me. mo 

IPB002870B 24.73 8.16e-16 145-183


964 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 8.05e-14 789-813 


964 


PR00289 


Disintegrin signature I 


PR00289A 14.29 2.80e-13 473-492 
IPB000135D 2.13 6.08e-13 788-812 
IPB000135D 2.13 9.08e-13 785-809 
IPB000135D 2.13 2.30e-12 786-810 
IPB000135D 2.13 6.10e-12 787-811 
IPB000135D 2.13 6.75e-12 790-814


964 


IPB001580


Calreticulin family


IPB001580F 2.93 5.50e-11 794-803
IPB002870A 12.22 8.80e-11 100-116
IPB000135D 2.13 3.64e-10 783-807

I DDAH 1 7<1D 1 A tsc A 1 n C A/f C 1 A 

IrrJUUl /oZa iu.Oo 4.ooe-IU 504-514 

IPB001580F 2.93 4.94e-10 801-810
IPB001580F 2.93 4.94e-10 802-811
irDuuuijjjLi z. u o.uye-iu /o4-ouo 

IPB001580F 2.93 1.00e-09 798-807


964 


IPB000130


"Neutral zinc metallopeptidases, 
zinc-binding region" 


IPB000130 5.86 1.86e-09 364-374 
PR00289B 11.74 1.89e-09 502-514 
IPB002870C 11.01 3.16e-09 300-310 


964 


IPB003191


Guanylate-binding protein 


IPB003191N 9.33 3.37e-09 779-809 


964 


PR00480 


Astacin family signature II 


PR00480B 14.35 3.45e-09 359-377 


964 


IPB001422 


Neuromodulin (GAP-43) 


IPB001422C 16.82 4.49e-09 777-812 


965 


IPB000329 


Uteroglobin family 


IPB000329A 11.99 3.57e-10 1-16 


965 


PR00486 


Uteroglobin signature I 


PR00486A 6.53 9.03e-09 2-16 


966 


IPB000407 


GDA1/CD39 family of nucleoside 
phosphatase 


IPB000407C 15.11 5.50e-24 175-197 
IPB000407D 11.44 2.16e-14 216-229
IPB000407B 8.75 3.86e-13 132-143
IPB001073
PR00007
Complement C1q protein



IPB000407F 16.53 3.89e-12 422-436 
IPB000407A 11.93 5.30e-12 56-67 
IPB000407E 19.08 8.20e-11 342-358
IPB000407G 17.95 8.20e-11 455-469



IPB001073B 20.88 5.78e-23 96-130 
IPB001073C 13.07 4.50e-13 163-182
IPB001073A 22.14 6.55e-13 42-76



Complement C1Q domain signature 
II 



PR00007B 15.63 9.56e-13 116-135 
IPB001073D 7.60 1.00e-11 195-204
PR00007D 9.66 2.00e-11 193-203
PR00007C 16.13 7.38e-11 163-184
PR00007A 20.64 9.32e-10 89-115



970 



970 



971 



971 



973 



973 



IPB000721 



IPB003006 



Gag gene protein p24 (core 
nucleocapsid protein) 



IPB000721E 14.33 1.57e-12 525-538



IPB001020 



IPB001759 



PR00895 



IPB002889 



IPB001871 



Immunoglobulin and major 
histocompatibility complex domain 



Histidine phosphorylation site in HPr 
protein 



IPB003006B 20.23 6.09e-11 206-243
IPB003006A 17.51 1.00e-10 160-182
IPB001020B 19.38 4.53e-09 378-416 



Pentaxin family 



Pentaxin signature V 



IPB001759D 18.25 4.67e-33 409-447 



WSC domain 



PR00895E 12.84 4.19e-18 417-436
PR00895D 14.46 2.38e-17 397-416
PR00895C 12.82 3.18e-17 370-388
IPB001759C 13.49 4.30e-17 370-388
IPB001759A 29.51 1.82e-14 113-147
PR00895A 14.28 8.83e-13 305-319
PR00895B 14.42 1.45e-12 327-341
IPB001759B 14.85 3.30e-11 327-341
IPB001759E 18.14 5.34e-11 459-473
PR00895F 15.89 9.50e-11 436-450



bZIP (Basic-leucine zipper) 
transcription factor family 



IPB002889B 11.76 5.15e-13 453-499
IPB002889B 11.76 1.55e-12 445-491
IPB002889B 11.76 4.18e-12 458-504



Jun transcription factor signature II 
Calcium-activated BK potassium 
channel alpha subunit signature VIII 



IPB001871 8.42 8.65e-12 633-645
IPB002889B 11.76 8.79e-12 447-493
IPB002889B 11.76 9.89e-12 440-486
IPB002889B 11.76 2.59e-11 439-485
IPB002889B 11.76 4.49e-11 441-487
IPB002889B 11.76 5.13e-11 454-500
IPB002889B 11.76 5.87e-11 437-483
IPB002889B 11.76 6.72e-11 448-494



973 



973 



973 



PR00043 



PR01449



PR00043B 8.71 8.92e-11 633-649



IPB002546 



Myogenic Basic domain 



PR01449H 2.34 9.85e-11 468-483
IPB002889B 11.76 2.19e-10 449-495
IPB002889B 11.76 2.58e-10 443-489
IPB002889B 11.76 3.87e-10 456-502
IPB002889B 11.76 4.46e-10 452-498
IPB002889B 11.76 6.44e-10 444-490



IPB000684



Eukaryotic RNA polymerase II
heptapeptide repeat



IPB002546E 13.48 9.04e-10 464-481
IPB002889B 11.76 9.41e-10 457-503
IPB002889B 11.76 1.00e-09 461-507
IPB002889B 11.76 1.28e-09 436-482



IPB000684L 3.49 2.10e-09 445-487
IPB002889C 9.89 2.21e-09 466-487
PR01449H 2.34 2.50e-09 469-484
PR01449H 2.34 2.50e-09 472-487
PR01449H 2.34 2.59e-09 466-481
PR01449H 2.34 2.59e-09 467-482
PR01449H 2.34 3.03e-09 463-478
IPB002889B 11.76 4.09e-09 438-484
PR01449H 2.34 4.18e-09 461-476
PR01449H 2.34 4.18e-09 464-479
PR01449H 2.34 4.35e-09 473-488
IPB002889B 11.76 4.47e-09 455-501
PR01449H 2.34 4.53e-09 453-468
IPB002889B 11.76 5.13e-09 442-488
IPB002889B 11.76 5.31e-09 431-477
IPB002546E 13.48 5.50e-09 469-486
IPB002889B 11.76 6.62e-09 463-509
IPB002889B 11.76 7.19e-09 462-508
IPB002889B 11.76 8.69e-09 450-496
IPB000684L 3.49 8.83e-09 447-489


977


IPB001359


Synapsin 


IPB001359H 22.58 1.95e-15 545-595 


977 


IPB000822


"Zinc finger, C2H2 type" 


IPB000822 14.67 3.00e-13 2087-2112 
IPB000822 14.67 1.86e-11 2476-2501
IPB001359H 22.58 4.46e-11 539-589
IPB000822 14.67 5.29e-11 2362-2387
IPB000822 14.67 6.57e-11 472-497
IPB000822 14.67 8.71e-11 2253-2278


977 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 9.02e-11 540-554
PR00049D 0.00 9.17e-11 541-555


977 


IPB003861 


E4 protein 


IPB003861B 9.06 1.43e-10 547-561


977


IPB002999


Tudor domain 


IPB002999C 10.33 2.00e-10 546-555 
IPB000822 14.67 2.29e-10 110-135 
IPB001359H 22.58 2.67e-10 537-587 


977 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 3.45e-10 2473-2486 
IPB001359H 22.58 5.08e-10 551-601 
IPB001359H 22.58 5.36e-10 541-591


977 


PR01217 


Proline rich extensin signature VII 


PR01217G 4.02 6.94e-10 545-570 
IPB000822 14.67 1.00e-09 602-627
PR00049D 0.00 2.98e-09 538-552 
PR00049D 0.00 3.29e-09 539-553 


977 


IPB002000 


Lysosome-associated membrane 
glycoprotein (Lamp) 


IPB002000D 5.87 3.72e-09 192-205 
PR00049D 0.00 3.90e-09 543-557 


977 


IPB000413 


Integrins alpha chain 


IPB000413A 13.51 4.33e-09 1509-1519 
IPB001359H 22.58 4.41e-09 547-597 
PR00049D 0.00 4.81e-09 537-551


977 


PR00021 


Small proline-rich protein signature I 


PR00021A 3.31 5.38e-09 538-550 
IPB000822 14.67 5.50e-09 1894-1919 
IPB000822 14.67 5.88e-09 1579-1604 
IPB001359H 22.58 5.89e-09 543-593 
PR00049D 0.00 6.03e-09 191-205 
IPB000822 14.67 6.62e-09 1662-1687 


977 


PR00239 


Molluscan rhodopsin C-terminal tail 
signature V 


PR00239E 1.29 6.97e-09 542-553 
IPB000822 14.67 7.00e-09 2053-2078 
IPB001359H 22.58 7.03e-09 546-596 
PR00048B 5.52 7.50e-09 2100-2109
IPB002999B 7.50 7.55e-09 545-553 
IPB002999B 7.50 7.55e-09 546-554 
IPB000822 14.67 8.12e-09 2116-2141 
IPB000822 14.67 8.50e-09 1267-1292 


977 


PR00776 


Hemoglobinase (C13) cysteine 
protease signature IV 


PR00776D 11.72 8.62e-09 2447-2466 
IPB001359H 22.58 8.95e-09 558-608 
IPB002000D 5.87 9.49e-09 542-555 


977 


PR00211 


Glutelin signature II 


PR00211B 0.86 9.92e-09 551-571 
IPB000822 14.67 1.00e-08 1032-1057
980


PR00834 


HtrA/DegQ protease family signature 
III 


PR00834C 15.48 6.81e-20 237-261
PR00834D 11.75 9.45e-18 275-292 


980 


IPB002350 


Kazal-type serine protease inhibitor 
family 


IPB002350 31.78 6.52e-17 73-113 
PR00834B 10.17 6.63e-14 196-216 
PR00834E 13.43 9.13e-13 297-314 


980 


IPB000867 


Insulin-like growth factor-binding 
protein 


IPB000867B 11.44 1.94e-12 23-39


980 


IPB000126 


"Serine proteases, V8 family" 


IPB000126B 12.50 3.32e-12 280-296
PR00834F 11.11 3.25e-11 389-401
PR00834A 8.79 5.83e-11 175-187
IPB000126A 11.75 5.69e-10 173-188


980 


PR00290 


Kazal-type serine protease inhibitor 
signature II 


PR00290B 16.63 2.80e-09 84-95 


980 


PR00722 


Chymotrypsin serine protease family 
(S1) signature III


PR00722C 10.74 4.10e-09 283-295


980 


PR01424


Transforming growth factor beta 1 
precursor signature I 


PR01424A 6.58 8.24e-09 8-27 


980 


IPB001489 


Heat-stable enterotoxin 


IPB001489 13.51 8.78e-09 26-38 


981 


PR00792 


Pepsin (A1) aspartic protease family
signature I 


PR00792A 11.02 5.32e-17 80-100 


981 


IPB001969 


Eukaryotic and viral aspartic protease 
active site 


IPB001969A 16.37 5.15e-13 87-103 
PR00792D 11.77 1.00e-12 395-410
PR00792C 8.65 6.29e-12 312-323 
IPB001969A 16.37 7.00e-10 310-326 


982 


IPB000917 


Sulfatase


IPB000917A 9.52 5.26e-10 44-55 


984 


IPB000834 


"Zinc carboxypeptidases, 
carboxypeptidase A metalloprotease 
(M14) family" 


IPB000834B 13.51 2.50e-17 103-117 


984 


PR00765 


Carboxypeptidase A metalloprotease 
(M14) family signature II 


PR00765B 14.48 1.39e-15 99-113 
IPB000834C 17.20 2.80e-15 172-188
IPB000834G 14.46 4.50e-15 318-333
IPB000834D 18.95 4.72e-12 199-225
PR00765D 14.06 9.45e-12 233-246 
PR00765C 10.88 1.82e-10 179-187 
IPB000834F 12.40 4.21e-10 285-297 
IPB000834E 9.80 2.15e-09 228-242 


985 


IPB000834 


"Zinc carboxypeptidases, 
carboxypeptidase A metalloprotease 
(M14) family" 


IPB000834B 13.51 2.50e-17 103-117


985 


PR00765 


Carboxypeptidase A metalloprotease 
(M14) family signature II


PR00765B 14.48 1.39e-15 99-113
IPB000834C 17.20 2.80e-15 172-188 
IPB000834G 14.46 4.50e-15 318-333 
IPB000834D 18.95 4.72e-12 199-225 
PR00765D 14.06 9.45e-12 233-246 
PR00765C 10.88 1.82e-10 179-187 
IPB000834F 12.40 4.21e-10 285-297 
IPB000834E 9.80 2.15e-09 228-242 


986 


IPB002871 


NifU-like N terminal domain 


IPB002871C 16.51 1.60e-33 81-113 
IPB002871D 14.11 6.87e-21 131-153 
IPB002871A 14.39 2.17e-17 35-50 
IPB002871B 12.43 6.79e-14 62-74 


990 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 8.29e-11 94-119


990 


PR00048 


C2H2-type zinc finger signature II 


PR00048B 5.52 9.50e-09 107-116 


991 
991 


IPB003527 
IPB001245 


MAP kinase 

Tyrosine kinase catalytic domain 


IPB003527D 21.53 5.58e-23 185-226 
IPB003527G 17.26 8.24e-22 285-322 
IPB003527C 14.70 3.05e-19 124-172 
IPB001245A 22.45 5.50e-17 132-172



991


IPB000959 


POLO box duplicated region 


IPB000959B 15.68 7.19e-17 116-156 
IPR001745R 91 £fi i io<» i<; ioo iin 


991 


IPB001772 


Kinase associated domain 1 


lrouui / /zu zu.oo j.yze-i4 iz/-o/ 


991 


IPB000095 


PAK-box /P21-Rho-binding


irDUUUUSOi^ 1 j. JO /.yie-IJ 40-oZ 
IPB003527A 17.00 6.14e-12 26-51 


991 


IPB000861 


PKN/rhophilin/rhotekin rho-binding
repeat 




991 


IPB000961 


Protein kinase C-terminal domain 


IPB000961D 21.23 5.91e-11 188-229
IPB003527B 11.51 9.15e-11 98-116


991 


PR00109 


Tyrosine kinase catalytic domain 
signature II


PR00109B 11.07 9.10e-10 139-157 
IPB000961C 15.48 8.83e-09 139-173 


992 


PR01432 


Rabaptin signature XI 


PR01432K 2.19 8.43e-09 976-998 


994 


IPB001073 


Complement C1q protein


IPB001073B 20.88 7.26e-29 175-209 


994 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 8.93e-27 75-127 


994 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 2.83e-26 74-127
IPB000885B 19.15 7.37e-23 80-133 
IPB001442A 26.12 7.39e-23 72-124 
IPB000885B 19.15 8.75e-23 77-130 
IPB000885A 11.46 1.79e-21 82-119 
IPB001073A 22.14 2.24e-21 78-112
IPB000885A 11.46 3.84e-21 79-116 
IPB000885A 11.46 5.11e-21 76-113 
IPB000885B 19.15 5.89e-21 71-124 
IPB000885B 19.15 7.56e-21 68-121 
IPB001442A 26.12 8.15e-21 66-118 
IPB001442A 26.12 8.40e-21 69-121 
IPB000885B 19.15 2.97e-20 62-115 
IPB001442A 26.12 3.72e-20 78-130 
IPB000885A 11.46 4.00e-20 70-107
IPB001442A 26.12 5.62e-20 63-115


994 


PR00007 


Complement C1Q domain signature I 


PR00007A 20.64 6.54e-20 168-194 
IPB000885A 11.46 8.20e-20 73-110 
IPB001442A 26.12 9.64e-20 84-136
IPB001442A 26.12 3.69e-19 87-139
IPB001442A 26.12 5.09e-19 60-112 
IPB001442A 26.12 7.43e-19 81-133 
IPB000885B 19.15 3.81e-18 83 


994 


IPB000817 


Prion protein 


IPB000817A 8.34 9.51e-10 76-118
IPB001442B 12.38 1.00e-09 106-126
IPB000885A 11.46 4.12e-09 58-95
IPB001442B 12.38 5.01e-09 97-117
IPB000817A 8.34 6.12e-09 77-119
IPB001442B 12.38 7.32e-09 73-93
IPB000885A 11.46 7.34e-09 106-143
IPB001442B 12.38 7.93e-09 70-90
IPB000885A 11.46 8.16e-09 55-92
IPB000885B 19.15 8.77e-09 101-154
IPB000817A 8.34 9.43e-09 65-107


996 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 4.24e-10 311-348 


997 


IPB001895 


Guanine-nucleotide dissociation 
stimulators CDC25 family 


IPB001895C 20.83 7.84e-30 1077-1112 
IPB001895D 18.68 1.00e-20 1174-1197 


997 


IPB001331 


Guanine-nucleotide dissociation 
stimulators CDC24 family 


IPB001331C 16.09 1.00e-18 377-402 
IPB001895B 16.80 3.10e-15 1005-1025
IPB001331B 19.33 7.00e-09 326-341 


999 


IPB002360 


Involucrin 


IPB002360C 15.36 3.70e-09 198-239
999


IPB000135


High mobility group proteins HMG1
and HMG2


IPB000135D 2.13 3.91e-09 202-226 


999 


PR00169 


Potassium channel signature I 


PR00169A 17.48 5.50e-09 68-87 


999


PR01083


Lymphocyte-specific protein 
signature I 


PR01083A 8.60 9.61e-09 214-237 


1001 


IPB000492 


Protamine 2 (PRM2) 


IPB000492B 5.26 5.11e-09 788-822 


1001 


IPB000221 


Protamine P1


IPB000221 5.48 7.46e-09 945-971 
IPB000221 5.48 8.85e-09 831-857 


1002 


IPB003403 


Herpesvirus immediate early protein 


IPB003403E 17.25 6.47e-10 52-79 


1002 


IPB001841 


RING finger 


IPB001841 10.69 3.84e-09 126-135 


1002


IPB000492 


Protamine 2 (PRM2) 


IPB000492B 5.26 5.11e-09 997-1031 


1002 


IPB000221 


Protamine P1


IPB000221 5.48 7.46e-09 1154-1180 
IPB000221 5.48 8.85e-09 1040-1066 


1003 


PR00320 


G protein beta WD-40 repeat 
signature I 


PR00320A 13.15 4.32e-12 1132-1146 
PR00320C 12.32 3.14e-11 1132-1146
PR00320B 12.82 7.55e-11 1132-1146
PR00320A 13.15 8.92e-10 1091-1105 
PR00320C 12.32 1.33e-09 1091-1105 


1003 


IPB001680 


G-protein beta WD-40 repeats 


IPB001680 10.43 1.45e-09 1134-1145 
PR00320B 12.82 2.24e-09 1091-1105 
PR00320A 13.15 4.86e-09 789-803 


1003 


PR01472 


Intercellular adhesion 
molecule/vascular cell adhesion 
molecule-1 signature I


PR01472A 16.78 9.82e-09 1154-1170


1004 


IPB000433


ZZ Zinc finger 


IPB000433 14.10 8.20e-18 21-37 


1004 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 7.86e-10 80-105 


1006 


IPB000008 


C2 domain 


IPB000008C 23.37 8.91e-26 323-362 
IPB000008D 14.83 1.23e-12 378-396
IPB000008B 17.91 3.09e-09 281-298
IPB000008E 14.84 3.90e-09 401-411


1007 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 5.91e-11 877-901
IPB000135D 2.13 7.44e-11 885-909
IPB000135D 2.13 7.85e-11 887-911
IPB000135D 2.13 3.05e-10 883-907
IPB000135D 2.13 5.11e-10 881-905
IPB000135D 2.13 8.14e-10 888-912 
IPB000135D 2.13 2.27e-09 876-900 
IPB000135D 2.13 2.27e-09 882-906 
IPB000135D 2.13 2.36e-09 880-904 


1007 


PR00806 


Vinculin signature IV 


PR00806D 11.95 3.78e-09 564-579
IPB000135D 2.13 3.91e-09 874-898 
IPB000135D 2.13 4.45e-09 889-913 
IPB000135D 2.13 6.36e-09 884-908 
IPB000135D 2.13 7.00e-09 879-903 
IPB000135D 2.13 7.18e-09 886-910 
IPB000135D 2.13 9.27e-09 920-944 


1008 


IPB000135 


High mobility group proteins HMG1 
and HMG2


IPB000135D 2.13 8.85e-21 560-584 
IPB000135D 2.13 2.47e-19 559-583 
IPB000135D 2.13 7.87e-19 561-585 
IPB000135D 2.13 8.53e-19 563-587 
IPB000135D 2.13 9.35e-19 558-582 
IPB000135D 2.13 7.25e-18 564-588 
IPB000135D 2.13 7.43e-17 55 


1008 


IPB003403 


Herpesvirus immediate early protein 


IPB003403E 17.25 6.81e-10 560-587


1008 


IPB003874 


CDC45-like protein 


IPB003874C 5.49 1.24e-09 571-582 


1008 


IPB001990 


Granins (chromogranin or 
secretogranin) 


IPB001990C 33.59 3.49e-09 538-585
1008


IPB000637 


HMG-I and HMG-Y DNA-binding 
domain (A+T-hook) 


IPB000637B 14.21 5.64e-09 568-586 
IPB000135D 2.13 6.09e-09 545-569 


1008 


IPB001580 


Calreticulin family 


IPB001580F 2.93 9.10e-09


1009 


PR00405 


HIV Rev interacting protein 
signature II 


PR00405B 10.10 2.93e-17 281-298 
PR00405A 18.83 3.86e-14 262-281 


1009 


PR00452 


SH3 domain signature II 


PR00452B 11.47 9.70e-10 895-910
PR00405C 18.05 3.95e-09 302-323 


1009 


IPB003134 


Repeat in HS1/Cortactin


IPB003134H 12.06 4.27e-09 880-929 


1009 


PR00910 


Luteovirus ORF6 protein signature I 


PRflftQlfiA O 1A Q *7i« r\a 11c *i ah 
r swjuy L\jj\ z. /4 o./ le-Uy 3JD-347 


1011 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 9.14e-12 511-548
IPB003006B 20.23 1.00e-11 318-355


1011 


PR01536 


Interleukin-1 receptor type I and type 
II family signature III 


PR01536C 19.92 9.23e-11 617-640
IPB003006B 20.23 6.40e-10 124-161 
IPB003006B 20.23 9.64e-10 610-647 
irDvvjKjyjOD zu.zo /.joe-uy zl-5o 
IPB003006B 20.23 8.62e-09 416-453
PR01536C 19.92 9.19e-09 225-248 


1015 


IPB002048 


EF-hand family 


IPB002048 7.91 2.29e-11 147-159


1015 


PR00450 


Recoverin family signature III 


PR00450C 11.99 1.58e-09 33-54
IPB002048 7.91 8.58e-09 74-86 


1016 


IPB003846 


Uncharacterized protein family 
UPF0061 


IPB003846E 18.41 1.00e-40 136-174
IPB003846F 24.67 9.36e-31 175-210
IPB003846D 28.31 1.61e-17 52-94
IPB003846G 13.31 5.09e-09 268-278


1017 


IPB003846


Uncharacterized protein family 
UPF0061 


IPB003846C 15.01 1.00e-40 176-219
IPB003846E 18.41 1.00e-40 468-506
IPB003846F 24.67 9.36e-31 507-542
IPB003846D 28.31 7.86e-25 235-277
IPB003846B 13.03 2.00e-11 148-159
IPB003846A 5.99 3.25e-11 140-146
IPB003846G 13.31 5.09e-09 600-610 


1017 


PR01548


Meiotic recombination protein 
rec114 signature I


rivviJHO/\ iU.ll O.Dze-tiy z3o-zDo 


1018 


PR00237 


Rhodopsin-like GPCR superfamily 
signature V 


PR00237E 13.03 3.12e-16 236-259 


1018 


PR00238 


Opsin signature II 


PR00238B 16.77 4.52e-14 208-220
r KUUZ3 id y. /o /.yze-i4 186-207 
PR00237B 12.45 1.39e-13 105-126 

rxwuzj/r l^f.j*!- 1 .0 /e- 1 3 zy4-3 1 o 

PR00237C 14.77 2.00e-13 150-172 
PR00237G 19.23 4.00e-13 332-358 


1018 


IPB000276 


Rhodopsin-like GPCR superfamily


IPB000276B 4.97 6.62e-13 244-255 
PR00237A 9.81 7.00e-12 72-96 
IPB000276A 11.56 5.24e-11 164-175
IPB000276D 9.40 4.52e-10 342-358 
rKuuzjoA 12.47 o.65e-09 93-105 


1018 


PR00667 


Retinal pigment epithelium-retinal 
GPCR signature II 


PR00667B 10.86 8.80e-09 91-106 


1019 


PR00019 


Leucine-rich repeat signature I 


PR00019A 11.72 2.80e-13 378-391 
PR00019B 11.42 2.33e-10 131-144 
PR00019B 11.42 6.33e-10 375-388
PR00019B 11.42 3.73e-09 225-238
PR00019B 11.42 4.00e-09 249-262
PR00019A 11.72 4.55e-09 252-265
PR00019A 11.72 8.09e-09 134-147 


1021 


IPB001895 


Guanine-nucleotide dissociation
stimulators CDC25 family


IPB001895C 20.83 3.00e-28 984-1019
IPB001895D 18.68 8.56e-17 1082-1105
IPB001895B 16.80 4.30e-15 913-933


1021 


IPB000595


Cyclic nucleotide-binding domain


IPB000595B 15.72 6.40e-11 355-378


1021 


IPB003351


Dishevelled specific domain


IPB003351F 12.17 4.43e-10 615-641


1021 


IPB001478 


PDZ domain (also known as DHR or
GLGF)


IPB001478B 6.12 3.25e-09 625-634 


1021 


PR00834 


HtrA/DegQ protease family signature
VI


PR00834F 11.11 6.03e-09 621-633


1022


IPB001895


Guanine-nucleotide dissociation
stimulators CDC25 family


IPB001895C 20.83 3.00e-28 934-969
IPB001895D 18.68 8.56e-17 1032-1055 
IPB001895B 16.80 4.30e-15 863-883 


1022 


IPB000595 


Cyclic nucleotide-binding domain 


IPB000595B 15.72 6.40e-11 305-328


1022 


IPB003351 


Dishevelled specific domain 


IPB003351F 12.17 4.43e-10 565-591 


1022 


IPB001478


PDZ domain (also known as DHR or
GLGF)


IPB001478B 6.12 3.25e-09 575-584


1022 


PR00834 


HtrA/DegQ protease family signature 
VI 


PR00834F 11.11 6.03e-09 571-583 


1024 


PR00907 


Thrombomodulin signature VIII 


PR00907H 1.34 7.64e-09 376-400 


1025 


PR00907 


Thrombomodulin signature VIII 


PR00907H 1.34 7.64e-09 338-362 


1027 


IPB003452 


Stem cell factor 


IPB003452A 12.58 1.00e-40 1-41 
IPB003452D 16.80 1.00e-40 173-211 
IPB003452C 13.68 6.76e-37 131-164 
IPB003452B 19.11 2.09e-18 53-101 
IPB003452B 19.11 8.06e-17 43-91


1028 


PR00205 


Cadherin signature II 


PR00205B 20.09 1.00e-19 150-179 
PR00205D 12.22 9.31e-19 238-257
PR00205F 19.57 3.37e-17 316-342
PR00205B 20.09 6.67e-16 374-403
PR00205B 20.09 2.20e-15 259-288
PR00205A 17.38 6.82e-14 90-109
PR00205F 19.57 1.00e-13 97-123
PR00205F 19.57 6.70e-13 427-453 


1028 


IPB002126 


Cadherin domain 


IPB002126A 14.68 9.40e-13 101-117 
IPB002126B 12.04 1.75e-12 247-264
PR00205G 13.05 4.30e-12 241-258
PR00205G 13.05 4.65e-11 499-516
IPB002126B 12.04 1.29e-10 138-155
PR00205E 10.82 2.17e-10 372-385
PR00205E 10.82 3.35e-10 257-270
IPB002126A 14.68 6.09e-10 431-447
PR00205D 12.22 6.55e-10 496-515
PR00205A 17.38 3.12e-09 420-439
PR00205D 12.22 5.33e-09 129-148 


1029 


PR00205 


Cadherin signature II 


PR00205B 20.09 1.00e-19 150-179 
PR00205D 12.22 9.31e-19 238-257
PR00205F 19.57 3.37e-17 316-342
PR00205B 20.09 6.67e-16 374-403
PR00205B 20.09 2.20e-15 259-288
PR00205A 17.38 6.82e-14 90-109 
PR00205F 19.57 1.00e-13 97-123 
PR00205F 19.57 6.70e-13 427-453 


1029 


IPB002126 


Cadherin domain 


IPB002126A 14.68 9.40e-13 101-117 
IPB002126B 12.04 1.75e-12 247-264
PR00205G 13.05 4.30e-12 241-258
PR00205G 13.05 4.65e-11 461-478
IPB002126B 12.04 1.29e-10 138-155
PR00205E 10.82 2.17e-10 372-385
PR00205E 10.82 3.35e-10 257-270
IPB002126A 14.68 6.09e-10 431-447
PR00205D 12.22 6.55e-10 458-477
PR00205A 17.38 3.12e-09 420-439
PR00205D 12.22 5.33e-09 129-148


1030 


PR00124 


ATP synthase C subunit signature I 


PR00124A 8.69 9.33e-10 41-60 


1030 


PR01131 


Connexin36 (Cx36) signature II 


PR01131B 3.45 3.17e-09 58-70 
PR00124A 8.69 6.70e-09 43-62 


1030 


IPB003836 


Glucokinase 


TPRflOlSlfiD 91 17 7 <»Q<»_nQ AQ 91 


1030 


PR01516 


Kv4.1 voltage-gated K+ channel 
signature VII 


PR01516G 4.80 8.98e-09 79-90 


1031 


IPB000180 


Renal dipeptidase 


IPB000180B 21.72 7.92e-34 242-281
IPB000180A 30.29 1.00e-33 172-215 
IPB000180C 22.01 5.67e-27 287-321 


1032 


IPB002027 


Amino acid permease 


IPB002027D 22.00 4.13e-25 325-364 
IPB002027C 19.67 2.74e-22 244-282 
IPB002027A 18.88 3.77e-16 47-75 
IPB002027B 12.67 7.97e-12 180-199 


1033 


IPB000559 


Formate-tetrahydrofolate ligase


IPB000559C 13.05 1.00e-40 453-502 
IPB000559F 12.70 1.00e-40 653-703
IPB000559G 15.54 1.00e-40 707-755 
IPB000559D 22.27 4.33e-37 554-594 

rpDAAAtf CQU 1*7 f\Q *7 Kin 1CL Cftf 

irtJuuujDyiv. 10.// o.yoe-jD yjj-yoo 
IPB000559B 12.60 2.88e-32 413-441

TPROflO^SQT 17 95 ^ Q4p 17 QAfl q-io 
IPB000559H 20.11 2.72e-26 770-810
IPB000559A 24.17 6.11e-25 368-412
IPB000559I 15.05 6.15e-18 856-880


1033 


PR00085 


Tetrahydrofolate 
dehydrogenase/cyclohydrolase 
family signature III 


PR00085C 13.81 5.70e-14 169-190 
PR00085B 16.65 1.21e-09 116-161


1034 


IPB000560 


Histidine acid phosphatase 


IPB000560 17.02 1.00e-11 378-400


1035 


IPB001331 


Guanine-nucleotide dissociation 
stimulators CDC24 family 


IPB001331C 16.09 2.40e-12 911-936 


1035 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 4.81e-09 1125-1139


1035 


PR00834 


HtrA/DegQ protease family signature 
VI 


PR00834F 11.11 5.24e-09 82-94 
PR00049D 0.00 5.73e-09 147-161 


1035 


IPB001478


PDZ domain (also known as DHR or 
GLGF) 


IPB001478B 6.12 7.19e-09 86-95


1035 


IPB002532


Hantavirus glycoprotein G2 


IPB002532J 16.97 8.37e-09 936-972 


1035 


PR00554


Adenosine A2B receptor signature II


PR00554R 19 S9 8 8*ip-flQ 774-719 


1037 


PR00390 


Phospholipase C signature I 


PR001Q0A 14 94 6 14p-90 90S-111 


1037 


IPB002048 


EF-hand family 


IPB002048 7.91 3.84e-09 147-159 


1039 


PR00245 


Olfactory receptor signature III


PRfl094 l 5f l 1 4 65 5 7£p 17 175 1 Q1 

PR00245E 8.96 2.73e-13 282-293 

PR0094SR 11 71 1 10p-19 198 14H 

PR00245D 9.34 9.33e-11 235-244


1039 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 1.47e-10 117-128 
PR00245A 10.98 8.80e-10 91-102 
IPB000276D 9.40 9.61e-10 281-297


1039 


PR00896 


Vasopressin receptor signature II 


PR00896B 9.36 5.50e-09 54-65 


1039 


PR00534 


Melanocortin receptor family 
signature I 


PR00534A 12.77 5.70e-09 50-62 


1039 


PR00237


Rhodopsin-like GPCR superfamily 
signature II 


PR00237B 12.45 7.16e-09 58-79 
PR00237E 13.03 8.20e-09 198-221 


1039 


IPB003211 


AmiS/UreI family transporter


IPB003211A 15.05 9.43e-09 27-66
1040


IPB003367 


Thrombospondin type 3 repeat 


IPB003367C 20.73 1.00e-40 428-478
IPB003367D 18.41 1.00e-40 479-521
IPB003367E 16.82 1.00e-40 522-569
irJ3UUjjo/r lo.zl l.OOe-40 580-629 
IPB003367G 17.08 1.00e-40 630-671
IPB003367H 15.25 1.00e-40 672-704
IPB003367J 18.60 1.00 


1040 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 4.79e-11 303-314
IPB003367E 16.82 5.67e-11 404-451
IPB003367C 20.73 5.96e-11 510-560
IPB003367E 16.82 6.83e-11 425-472
IPB003367C 20.73 2.38e-10 588-638 
IPB003367C 20.73 6.35e-10 548-598 


1040 


IPB003129 


domains 


irD\j\JoiZyi5 Zo.JU /.ooe- 10 .5,3-5 o 

TPFinn*3'2/c*7/"'' on hi q ac*. in agi cm 
lrDKAJDOO /K^ JA). 15 o.4oe-l0 4M-JU1 

IPROO^^fiTP 16 S9 8 in ^An_An7 

uDWjjO/D lO.CZ O.OOe-lU DOv-Ov/ 

IPB003367C 20.73 6.20e-09 392-442 


1040 


IPB001774 


Delta serrate ligand 


IPB001774D 19 9^ 0 Q1p-DO 996-979 


1042 


IPB000109 


PTR peptide transporters (PTR2) 


IPB00010QD 95 HO A 67<»-'*9 zl^n^L77 

IPB000109B 29 23 4 1 8e-21 67-1 1 Q 
IPB000109A 10.85 3.79e-15 44-62
IPB000109C 8.21 7.00e-14 195-207 


1042 


PR00308 


Type I antifreeze protein signature III 


PR00308C 2.79 2.78e-09 20-29 


1042 


PR01471


Histamine H3 receptor signature II 


PR01471B 12.38 9.63e-09 24-42 


1043 


IPB003104 


Formin Homology 2 Domain 


IPB003104B 18.83 6.87e-21 785-814 
IPB003104C 20.33 1.27e-14 957-984 


1043 


IPB001073 


Complement Clq protein 


IPB001073A 22.14 3.25e-09 545-579


1043 


IPB001359 


Synapsin 


IPB001359H 22.58 7.99e-09 553-603 


1043 


PR01471 


Histamine H3 receptor signature V 


PR01471E 5.41 8.14e-09 543-558


1044 


IPB001909 


KRAB box


IPB001909 17.37 6.32e-28 10-44 


1044 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 9.10e-22 592-617 
IPB000822 14.67 9.18e-21 228-253 
IPB000822 14.67 5.50e-19 452-477 
IPB000822 14.67 6.25e-19 284-309 
IPB000822 14.67 7.23e-18 368-393 
IPB000822 14.67 9.31e-18 144-169 
IPB000822 14.67 2.29e-17 536-561 

tddaaaooo i a c~i o n /on ct\e 

IriJUUUozz 14.0 / o.07e-17 480-505 
IPB000822 14.67 9.36e-17 256-281 
IPB000822 14.67 2.20e-16 340-365 
IPB000822 14.67 5.20e-16 172-197 
IPB000822 14.67 5.20e-16 200-225 
IPB000822 14.67 5.80e-16 564-589 
lrBUOUoZZ 14.0 / o.20e-16 396-421 
IPB000822 14.67 8.80e-16 424-449 
IPB000822 14.67 3.25e-15 508-533 
IPB000822 14.67 4.94e-15 620-645


1044 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 5.50e-15 589-602 
PR00048A 9.94 6.40e-15 253-266 
IPB000822 14.67 1.00e-14 312-337 
PR00048A 9.94 5.15e-14 533-546 
PR00048A 9.94 6.79e-13 393-406 
IPB000822 14.67 7.50e-13 116-141 


1044 


IPB001275 


DM DNA binding domain 


IPB001275 19.17 9.86e-13 580-619 
PR00048A 9.94 1.53e-12 477-490
PR00048A 9.94 5.24e-12 561-574
PR00048A 9.94 5.76e-12 225-238

TPR00197S 1Q 17 o tz/ia, 10 O/f/1 no? 

irr>uuiz/j ly.if o.ooe-iZ Z44-Z83 
PR00048A 0 04. Q 47*» 19 081 90/1 

PR00048A 9.94 1.00e-11 141-154


1044 


IPB001222 


TFIIS zinc ribbon domain


TPROO 1 999 94 61 S £Q*» no 1 1 £ 1 <9 
irjjvuizz.^ ^t.Oj O.Oi/e-uy 1 10~ljZ 

PR00048R S S9 7 00*» ftQ 4Q7 S09 
PR00048A 0 04 7 11t> OQ 491.474 

PR00048A 9.94 9.25e-09 449-462 
IPB001222 24 63 0 40e-0Q 144-180 


1044 


IPB002801 


Aspartate carbamoyltransferase 
regulatory chain 


IPB002801C 14.18 9.50e-09 254-270 
PR00048B 5 52 9 50e-00 381-^00 


1046 


IPB003137 


Protease associated (PA) domain 


IPB003 137 22 40 2 50e-1 9 1 88-9 1 R 


1048 


IPB001627 


Sema domain 


IPB001627J 1 1 43 9 40e-1 1 407 410 
IPB001627K 13.76 6.58e-11 477-489


1048 


IPB002165 


Plexin repeat 


IPB002165D 14 72 7 91e-1 1 477-480 


1049 


IPB000243 


Proteasome B-type subunit 


IPB000243C 13.61 8.80e-09 52-62 


1049 


PR00766 


Amiloride-sensitive amine oxidase 
signature VII 


PR00766G 10 8S 0 97p-OQ 01 111 


1050 


IPB001140 


ABC transporter transmembrane
region 


TPR001140R IS 69 4 0S*»-14 178 17£ 


1051 


IPB000433 


ZZ Zinc finger 


IPB000433 14.10 8.20e-18 21-37 


1051 


IPB000822


"Zinc finger, C2H2 type"


IPB000822 14.67 7.86e-10 80-105 


1052 


IPB000353 


"Class II histocompatibility antigen, 

beta chain, beta-1 domain"


IPB000353B 19.16 9.22e-14 133-182


1052 


IPB003006 


Immunoglobulin and major 

histocompatibility complex domain


IPB003006B 20.23 4.43e-12 86-123 
tproo700^a 17 <i 4 nn*» 11 is/i 17/; 


1052 


IPB001003


"MHC Class II, alpha chain, alpha-1
domain" 


IPB001003B 14.72 5.40e-10 141-184 


1053 


PR00018 


Kringle domain signature I


PR0001 8A 19 97 4 IOp 00 76 SI 


1055 


IPB001039 


"Major histocompatibility complex 
protein Class I"


IPB001039A 17.17 1.00e-40 15-68 

IPR001O7QR 97 SS 1 00a_40 Q£_1A7 
IPB001039C 10 82 1 00e-40 177-970 
IPB001039D 16.49 1.00e-40 255-309 


1055 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.00e-30 261-298 
IPB003006A 17 SI 1 00e-91 994-946 


1055 


IPB000353 


"Class II histocompatibility antigen, 
beta chain, beta-1 domain" 


IPB000353B 19.16 7.65e-14 203-252


1055 


IPB003363 


Glycoprotein GG/GX 


IPB003363E 13.35 8.75e-11 308-340


1055 


IPB003705 


Cobalt transport protein CbiN 


IPB003705A 9.20 6.25e-09 316-332 
IPB000353C 20.11 7.97e-09 254-308


1062 


PR01382


Claudin-9 signature IV


PPni789r* 19 7Q 1 11^ 1 £. 7A1 Oil 
rKUiooZiJ IZ.Jo 1. 1 le-lo ZU1-Z13 


1062 


IPB000729 


PMP-22/EMP/MP20 family 


IPB000729D 18.96 2.96e-16 160-187 
fpnAon79or i 7*7 ©7 toi- i/c on iio 

PPfll 789 A 19 AA 1 1 7o 1 <T 7*7 vl*7 
rlvUlJoZA lZ.vU I. i /e-lD j /-4/ 


1062 


PR01077


f" 1 ! miff in cicm ofn itv» TTT 


PPA1A77/" 1 11 A(\ 1 /t*7« t A HI "11 ~' 
rKUlU//C ij.OU 1.4/e-i4 03-/3 

PP01789P S 67 5 14#> 17 1QA 1 OQ 

PR01382B 7.06 1.12e-12 91-100
PR01077B 14.12 1.00e-10 49-55 
PR01077D 11.20 4.00e-10 146-152 
PR01077A 9.72 8.16e-09 21-30 


1064 


IPB001478 


PDZ domain (also known as DHR or 
GLGF) 


IPB001478B 6.12 5.50e-09 453-462 
IPB001478B 6.12 7.75e-09 258-267 


1066 


IPB002659 


Galactosyltransferase 


IPB002659A 26.24 4.80e-11 92-133


1067 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245A 22.45 7.60e-28 119-159 


1067 


IPB001772 


Kinase associated domain 1 


IPB001772C 20.66 9.25e-24 114-144


1067 


IPB000961 


Protein kinase C-terminal domain 


IPB000961C 15.48 2.13e-22 126-160 
IPB001772D 21.67 4.55e-17 186-225


1067


IPB000959 


POLO box duplicated region 


IPB000959B 15.68 8.60e-17 103-143 


1067 


IPB000095 


PAK-box /P21-Rho-binding 


IPB000095E 17.62 9.03e-17 127-172 


1067 


IPB003527 


MAP kinase 


IPB003527C 14.70 1.95e-16 111-159 


1067 


IPB000861 


PKN/rhophilin/rhotekin rho-binding
repeat


irtjuuuooir lo.50 1.55e-15 120-174 


1067 


IPB000494 


"Epidermal growth-factor receptor
(EGFR), L domain"


IPB000494C 24.40 7.35e-14 113-159
IPB000959D 27.01 4.26e-13 226-278 
IPB000961D 21.23 7.19e-13 175-216 
IPB001245B 21.68 8.96e-13 179-217 
IPB003527A 17.00 7.85e-11 18-43
IPB001772B 24.88 8.46e-11 233-272
IPB001772A 13.64 2.29e-10 9-40 
IPB003527G 17.26 3.37e-09 245-282 


1067 


PR00109


Tyrosine kinase catalytic domain
signature II


PR00109B 11.07 4.23e-09 126-144
IPB003527D 21.53 4.60e-09 172-213 


1068 


PR01254


Prostaglandin D synthase signature I


PR01254A 12.32 3.37e-29 31-54
rKUiZD*HJ ij.oO /.y/e-27 10SM32 
PR01254C 10.60 4.68e-22 74-92 

PR0.1 O^AV 1 A HQ 7 ^fi« 11 1/C9 ion 
rivuizj*fr IU.vo /,Joe-21 102-lo0 

ppfli?^4F i flft*» 18 1A< I^O 
rivuujHc i*f,v/ i.uue-io i*fj-iDy 


1068 


PR00179 


Lipocalin signature II 


PR00179B 7.67 5.26e-13 120-132 

PR0017QP 17 1(\ 1 KzL» 19 148 1£7 
PR01254B 12.05 9.04e-17 57-67


1068 


PR01275 


Neutrophil gelatinase lipocalin 
signature V 


PR01275E 6.38 1.72e-10 115-133 
PR00179A 13.97 3.25e-10 37-49 


1068 


PR01215 


Alpha-1-microglobulin signature IV


PR01215D 12.88 9.78e-10 111-130


1068 


IPB000566 


Lipocalin and cytosolic fatty-acid
binding protein


TPRnnn^^R 8 oi i /i9v> no ion im 


1068 


PR01174


Retinol binding protein signature VI


PR01 174F 1 1 If* % Q6f> HQ 1 1Q 1 15 


1068 


PR01273 


Invertebrate colouration protein 
signature IV 


PR01273D 11.48 4.41e-09 120-134 
PR01275B 9.02 8.57e-09 39-49 


1069 


IPB000704 


"Casein kinase II, regulatory subunit" 


IPB000704B 17.35 6.26e-09 90-128 


1070 


IPB001464 


Annexin family


IPB001464D 25.42 1.00e-40 281-335
IPB001464B 28.31 6.76e-40 151-203 
IPB001464A 31.17 1.27e-35 79-133 
IPB001464C 24.68 6.40e-30 214-253 


1070 


PR00196 


Annexin family signature IV


rtsXnJiyoU 21.41 j.ole-22 21V-245 
PR00196E 9.70 7.75e-21 299-319 


1070 


PR00201 


Annexin type V signature VII 


PR00201G 12.46 1.00e-20 299-325 
PR00196C 9.01 7.09e-20 136-157
IPB001464B 28.31 4.88e-19 79-131 

PPHA1 OA. A 19 A*7 9 A In. 1 O ACS ni 

rKuuiyoA 12.0/ 2.42e-io oy-yl 


1070 


PR00199 


Annexin type III signature VI 


PR00199F 15.67 5.10e-18 219-245 
IPB001464D 25.42 9.21e-18 122-176
IPB001464B 28.31 3.86e-17 235-287 
IPB001464A 31.17 6.68e-17 151-205 


1070 


PR00200 


Annexin type IV signature VII


PR00200G 9.20 8.41e-17 299-325
PR00199D 4.74 2.11e-16 295-316
PR00199G 9.85 5.29e-16 300-325 
PR00196C 9.01 5.96e-16 295-316 
PR00199D 4.74 7.04e-16 136-157 


1070 


PR00197 


Annexin type I signature IV 


PR00197D 7.59 7.56e-16 136-157 
PR00196B 11.03 9.31e-16 109-125 


1070 


PR00198 


Annexin type II signature IV 


PR00198D 7.41 9.88e-16 136-157 
PR00200E 8.88 5.88e-15 136-157 
PR00197F 9.40 7.39e-15 299-319 


1070 


PR00202


Annexin type VI signature VII 


PR00202G 8.03 9.71e-15 299-325
IPB001464A 31.17 1.85e-14 235-289
PR00197D 7.59 1.94e-14 295-316 
PR00196C 9.01 5.02e-14 64-85 
PR00201D 8.61 9.29e-14 136-157 
PR00199D 4.74 2.84e-13 64-85 
PR00198D 7.41 3.15e-13 295-316 PR00 


1071 


IPB000175 


Sodium:neurotransmitter symporter
family 


IPB000175A 16.29 1.00e-40 52-101 
IPB000175C 15.09 1.00e-40 212-263
IPB000175F 25.63 4.50e-38 467-506
IPB000175E 21.88 5.95e-35 372-411
IPB000175B 19.12 9.05e-33 139-173 


1071 


PR00176 


Sodium/chloride neurotransmitter 
symporter signature I 


PR00176A 16.97 3.25e-27 52-73 
PR00176C 10.57 7.86e-25 124-150 


1071 


PR01195 


GAT-1 GABA neurotransmitter
transporter signature II


PR01195B 13.58 1.22e-24 194-211 
PR00176G 13.12 3.77e-22 458-478 
PR01195D 9.00 3.75e-21 583-600
PR00176E 11.14 5.20e-21 322-342 
PR00176F 11.11 1.36e-19 376-395 
IPB000175G 16.18 5.13e-19 528-550 
PR00176B 7.07 9.63e-19 81-100 
PR01195A 7.44 1.90e-18 18-32 
PR00176D 8.96 6.48e-18 239-256 
PR00176H 15.94 7.63e-18 498-518 
IPB000175D 23.45 1.28e-17 278-330 
PR01195C 15.62 1.14e-13 348-357


1072


IPB003006


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.92e-10 98-135 


1073 


IPB001863 


Glypican 


IPB001863D 26.43 5.62e-33 250-294 
IPB001863E 33.79 3.08e-29 298-350 
IPB001863B 38.78 1.45e-25 134-186 
IrJB00l5o3r 26.99 o.59e-22 429-463 
IPB001863C 20.17 1.37e-16 191-220 
IPB001863A 13.95 5.03e-15 56-71
IPB001863G 11.32 4.68e-12 487-505


1073 


PR00436 


Interleukin-8 signature I 


PR00436A 15.20 7.91e-10 1-24 


1073 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 3.90e-09 515-529


1073 


IPB001702 


General diffusion Gram-negative
porins


IPB001702D 9.64 1.00e-08 536-546 


1075 


IPB001675 


Glycosyltransferase family 29 


IPB001675A 26.48 5.76e-31 296-340 
IPB001675B 15.84 6.50e-15 434-456 


1075 


PR01329 


Kir3.3 inward rectifier K+ channel 
signature II 


PR01329B 8.30 9.29e-09 7-21 


1078 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 7.84e-26 1244-1271 
IPB001599F 18.95 7.00e-24 785-814 
IPB001599H 18.42 6.40e-20 1019-1046 
IPB001599A 10.97 9.69e-18 123-141 
IPB001599N 24.85 2.24e-14 1437-1469 


1078 


IPB001134 


"Netrin, C-terminus" 


IPB001134C 17.82 4.13e-13 1257-1271 
IPB001599M 13.29 4.71e-13 1384-1395
IPB001599G 13.87 8.94e-13 987-996
IPB001599B 7.45 4.89e-12 209-221
IPB001599D 11.61 6.90e-12 728-738
IPB001599J 20.99 3.00e-11 1085-1110
IPB001599I 10.83 7.60e-11 1054-1063
IPB001599K 8.15 1.46e-10 1214-1225
IPB001599C 14.40 3.55e-09 236-252
IPB001599E 11.06 9.77e-09 755-764


1079 


IPB001599


Alpha-2-macroglobulin family


IPB001599F 18.95 7.00e-24 799-828
IPB001599A 10.97 9.69e-18 136-154
IPB001599B 7.45 4.89e-12 222-234
IPB001599D 11.61 6.90e-12 742-752
IPB001599C 14.40 3.55e-09 249-265 
IPB001599E 11.06 9.77e-09 769-778 


1080 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599A 10.97 9.69e-18 123-141 
IPB001599B 7.45 4.89e-12 209-221 
IPB001599C 14.40 3.55e-09 236-252 


1081 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 7.84e-26 1244-1271 
IPB001599F 18.95 7.00e-24 785-814 
IPB001599H 18.42 6.40e-20 1019-1046 
IPB001599N 24.85 7.69e-20 1437-1469 
IPB001599A 10.97 9.69e-18 123-141 


1081 


IPB001134


"Netrin, C-terminus"


IPB001134C 17.82 4.13e-13 1257-1271 
IPB001599M 13.29 4.71e-13 1384-1395 
IPB001599G 13.87 8.94e-13 987-996 
IPB001599B 7.45 4.89e-12 209-221 
IPB001599D 11.61 6.90e-12 728-738
IPB001599J 20.99 3.00e-11 1085-1110
IPB001599I 10.83 7.60e-11 1054-1063
IPB001599K 8.15 1.46e-10 1214-1225
IPB001599C 14.40 3.55e-09 236-252
IPB001599E 11.06 9.77e-09 755-764


1082 


IPB001599 


Alpha-2-macroglobulin family


IPB001599F 18.95 7.00e-24 786-815 
IPB001599A 10.97 9.69e-18 123-141 
IPB001599B 7.45 4.89e-12 209-221 
IPB001599D 11.61 6.90e-12 729-739 
IPB001599C 14.40 3.55e-09 236-252 
IPB001599E 11.06 9.77e-09 756-765


1083 


IPB002018 


Carboxylesterases type-B 


IPB002018 21.41 2.38e-27 195-235 
IPB002018 21.41 2.47e-12 504-544


1083 


PR00878 


Cholinesterase signature VI


PR00878F 4.95 8.07e-09 460-472


1084 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 1.64e-16 1682-1697 
IPB000152 8.86 1.53e-15 1178-1193
IPB000152 8.86 1.47e-14 1136-1151
IPB000152 8.86 2.89e-14 1095-1110
IPB000152 8.86 3.84e-14 932-947 
IPB000152 8.86 4.79e-14 1219-1234 
IPB000152 8.86 5.74e-14 642-657 
IPB000152 8.86 3.05e-13 1054-1069 


1084 


IPB001881 


Calcium-binding EGF-like domain


IPB001881B 12.28 4.00e-13 1682-1693


1084 


IPB003367 


Thrombospondin type 3 repeat


IPB003367A 11.78 7.72e-13 1023-1043
IPB001881B 12.28 7.75e-13 1095-1106
IPB000152 8.86 9.18e-13 1261-1276
IPB001881B 12.28 1.00e-12 642-653
IPB001881B 12.28 2.20e-12 1483-1494
IPB000152 8.86 6.40e-12 1483-1498
IPB001881B 12.28 6.40e-12 1178-1189
IPB001881B 12.28 8.20e-12 1261-1272
IPB001881B 12.28 9.40e-12 1136-1147 


1084 


IPB003886 


Extracellular domain in nidogen 


IPB003886D 13.91 1.00e-11 1136-1155


1084 


PR00010


Type II EGF-like signature III


PR00010C 6.98 1.37e-11 1687-1697
IPB001881B 12.28 3.84e-11 1219-1230
PR00010C 6.98 4.00e-11 1183-1193


1084 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat" 


IPB000033B 7.05 4.24e-11 1059-1069
IPB001881B 12.28 6.68e-11 932-943
IPB003886D 13.91 2.92e-10 1219-1238


1084


IPB003306 


WIF domain 


IPB003306E 25.51 4.00e-10 176-221


1084


IPB000034


Laminin B

TPHAAAA^/IA OO 01 A *Zd~ i/\ 1 0*7 oil 

irts\ju\j\JoHJ\ zz.Zi 4.o2e-10 187-222 
IPB001881B 12.28 5.29e-10 1054-1065 
IPB000152 8.86 5.50e-10 1303-1318 
IPB000033B 7.05 5.65e-10 1266-1276 
IPB000033B 7.05 6.23e-10 1100-1110
IPB001881B 12.28 6.57e-10 1303-1314
IPB001881B 12.28 7.43e-10 1014-1025
IPB000152 8.86 7.75e-10 890-905

TPPAAAAI^P t a< q ntZa. in t/zon 1 

irDKjvvvjoD o.Zoe-lU Ioo7-loy7 
PR00010C 6.98 8.55e-10 937-947 


1084 


IPB000006 


"Vertebrate metallothionein, family 
1" 


IPB000006 13.41 8.94e-10 175-220 
IPB003886D 13.91 1.00e-09 1682-1701 
IPB000033B 7.05 1.24e-09 647-657 
IPB000033B 7.05 1.47e-09 1141-1151 
IPB000033B 7.05 1.95e-09 1183-1193 
IPB003306D 23.91 2.18e-09 194-242 
PR00010C 6.98 2.32e-09 647-657 
IPB003886D 13.91 2.52e-09 1178-1197 


1084 


PR00011 


Type III EGF-like signature IV 


PR00011D 12.12 4.21e-09 413-431 
IPB003886D 13.91 4.32e-09 1095-1114

IPB001881B 12.28 4.52e-09 890-901 

tpraaaa'itr n c\k a to** no 00*7 ciai 
irD\j\j\j\jojD /.kjj 4. /ye-uy yj /-y4/ 

praaai ap & os a cko aq 1 a<;o 1 a^q 
r ivuv/v/iu^ 0.2/0 *T.yje-i/y lU^y-iuoy 

PR00010C 6.98 5.39e-09 1224-1234 

PR00010C 6.98 6.71e-09 1266-1276
IPB001881B 12.28 6.87e-09 1442-1453
IPB000033B 7.05 6.92e-09 1224-1234 
IPB003886D 13.91 7.09e-09 1261-1280 


1084 


IPB002221 


WAP-type (Whey Acidic Protein) 
four-disulfide core domain


IPB002221B 17.12 7.75e-09 1466-1487 


1084 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 8.02e-09 92-106 


1084 


PR00009 


Type I EGF signature III 


PR00OOQP 11 7A 8 9Ap-AQ 1 ASS 1AAQ 

IPB000152 8.86 8.58e-09 1637-1652 


1084 


IPB002557 


Chitin binding domain 


IPB002557B 12.64 9.31e-09 1453-1466 


1084 


IPB000561 


EGF-like domain 


IPB000561 4.89 9.36e-09 1187-1195 


1084 


IPB002919 


Trypsin Inhibitor-like cysteine rich 
domain 


IPB002919B 21.14 9.51e-09 899-921 
IPB000152 8.86 9.76e-09 1442-1457 
IPB003886D 13.91 9.86e-09 642-661 

TPRAA^fiR^ri 1 1 01 Q CAo no 010 Q<1 

irDuujoooiy u.yi y.ooe-uy yjz-yoi 

fPRAAAS^! 4fiQ 1 AA*» AS AOA_2lOfi 

PR00010C 6.98 1.00e-08 1141-1151 


1086 


PR00014 


Fibronectin type III repeat signature 
IV 


PR00014D 15.12 9.25e-13 571-585 

PRAAA14P 14 47 & fJ\i* 1 1 A^l ££0 

PR00014D 15.12 7.75e-11 872-886
PR00014D 15.12 5.74e-10 443-457

PR0AA14P 14 47 SA*» 1 A RS/L.870 

PR00014A 8.22 1.00e-08 816-825 
PR00014D 15.12 1.00e-08 770-784


1087 


IPB001909 


KRAB box 


IPB001909 17.37 7.75e-31 16-50 


1087 


IPB000822


"Zinc finger, C2H2 type"


IPB000822 14.67 7.55e-21 219-244 
IPB000822 14.67 4.21e-17 191-216 
IPB000822 14.67 8.80e-16 163-188 


1087 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 5.85e-14 188-201 
PR00048A 9.94 9.31e-14 244-257 
PR00048A 9.94 8.41e-12 216-229 


1087 


IPB001275 


DM DNA binding domain 


IPB001275 19.17 5.24e-11 207-246
PR00048A 9.94 7.16e-11 160-173
PR00048B 5.52 6.14e-10 232-241
IPB001275 19.17 7.45e-10 151-190 
IPB001275 19.17 8.06e-09 179-218 


1088 


IPB001909


KRAB box 


IPB001909 17.37 7.75e-31 16-50 


1088 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 6.21e-11 160-173


1089 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494C 14.46 8.36e-35 20-63 
IPB002494C 14.46 5.74e-34 89-132 
IPB002494C 14.46 1.44e-30 99-142 
IPB002494C 14.46 7.86e-29 64-107 
IPB002494C 14.46 1.41e-27 74-117 
IPB002494C 14.46 4.71e-25 30-73 
IPB002494C 14.46 6.69e-25 79-


1089 


IPB000359 


Cystine-knot domain 


IPB000359B 19.26 9.57e-13 24-42 
IPB000359B 19.26 9.57e-13 68-86 
IPB002494C 14.46 9.61e-13 73-116 
IPB002494B 10.58 2.50e-12 51-65 
IPB002494B 10.58 2.50e-12 95-109 
IPB002494C 14.46 4.37e-12 34-77 
IPB002494A 12.44 5.22e-12 91-124 
IPB002494C 14.46 6.06e-12 93-136 
IPB002494C 14.46 7.47e-12 83-126 


1089 


IPB000006 


"Vertebrate metallothionein, family 
1" 


IPB000006 13.41 7.62e-12 66-111
IPB002494B 10.58 7.75e-12 65-79 


1089 


IPB001271 


Mammalian defensin 


IPB001271 19.97 7.95e-12 58-86 
IPB002494B 10.58 9.55e-12 120-134 
IPB001271 19.97 9.59e-12 19-47
IPB002494B 10.58 1.28e-11 26-40
IPB002494B 10.58 1.28e-11 70-84
IPB002494A 12.44 1.86e-11 121-154
IPB002494A 12.44 2.82e-11 56-89
IPB001271 19.97 3.06e-11 103-131
IPB000006 13.41 4.50e-11 70-115
IPB000006 13.41 5.50e-11 40-85
IPB002494C 14.46 6.64e-11 98-141
IPB002494C 14.46 6.73e-11 78-121
IPB000006 13.41 8.20e-11 65-110
IPB002494A 12.44 9.14e-11 57-90
IPB001271 19.97 1.88e-10 28-56
IPB001271 19.97 1.88e-10 72-100
IPB002494C 14.46 2.14e-10 14-57
IPB002494B 10.58 2.48e-10 56-70 
IPB000006 13.41 2.65e-10 61-106 
IPB001271 19.97 2.94e-10 67-95 
IPB001271 19.97 3.12e-10 18-46 
IPB000006 13.41 3.42e-10 22-67 
IPB002494B 10.58 4.22e-10 110-124 


1089 


IPB001762 


Disintegrin 


IPB001762A 23.93 4.26e-10 39-79 
IPB002494A 12.44 4.27e-10 46-79 
IPB000006 13.41 4.29e-10 21-66 
IPB001762A 23.93 4.45e-10 44-84
IPB001271 19.97 5.41e-10 117-145 
IPB000006 13.41 6.23e-10 91-136 
IPB001271 19.97 6.47e-10 123-151 
IPB000006 13.41 6.61e-10 26-71 
IPB002494B 10.58 6.64e-10 31-45 
IPB002494B 10.58 6.64e-10 75-89 
IPB002494B 10.58 6.91e-10 41-55 
IPB002494B 10.58 6.91e-10 85-99
IPB002494C 14.46 7.64e-10 108-151
IPB002494A 12.44 7.65e-10 67-100
IPB002494B 10.58 7.72e-10 100-114
IPB002494A 12.44 8.06e-10 82-115 
IPB002494C 14.46 8.25e-10 19-62 


1089 


IPB000967 


Zinc finger NF-X1 type 


IPB000967E 21.88 8.67e-10 51-91 
IPB000359B 19.26 8.76e-10 59-77 
IPB001271 19.97 8.76e-10 88-116 
IPB000006 13.41 9.03e-10 114-159 
IPB001762A 23.93 9.04e-10 45-85 
IPB001762A 23.93 9.04e-10 94-134 
IPB002494C 14.46 9.48e-10 4-47 


1089 


IPB001169 


"Integrin beta, C-terminus" 


IPB001169K 27.45 4.89e-09 86-128
IPB001271 19.97 4.93e-09 29-57
IPB001271 19.97 4.93e-09 73-101
IPB001271 19.97 4.93e-09 97-125
IPB001271 19.97 4.93e-09 102-130
IPB002494C 14.46 4.95e-09 65-108
IPB000006 13.41 5.22e-09 81-126


1090 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494C 14.46 9.43e-29 24-67
IPB002494C 14.46 3.22e-22 14-57 
IPB002494C 14.46 8.08e-21 29-72 
IPB002494C 14.46 7.99e-20 19-62 
IPB002494A 12.44 3.29e-19 31-64 
IPB002494C 14.46 8.65e-18 9-52 
IPB002494A 12.44 8.15e-17 21-54 
IPB002494A 12.44 7.17e-16 36-69
IPB002494A 12.44 6.12e-15 2-35 
IPB002494A 12.44 4.96e-14 26-59 
IPB002494C 14.46 2.86e-13 5-48 
IPB002494C 14.46 4.72e-13 28-71 
IPB002494C 14.46 5.30e-13 4-47 
IPB002494A 12.44 6.19e-13 12-45 
IPB002494A 12.44 6.54e-13 41-74 
IPB002494A 12.44 8.15e-13 1-34 
IPB002494C 14.46 9.51e-13 20-63


1090


IPB000359


Cystine-knot domain


IPB000359B 19.26 9.57e-13 28-46


1090 


IPB000006 


"Vertebrate metallothionein, family 
1" 


IPB000006 13.41 4.21e-12 26-71 


1090 


IPB001271


Mammalian defensin 


IPB001271 19.97 7.75e-12 18-46 
IPB002494A 12.44 1.11e-11 11-44
IPB002494B 10.58 1.28e-11 30-44
IPB002494A 12.44 6.25e-11 16-49
IPB002494C 14.46 8.27e-11 15-58
IPB002494A 12.44 8.39e-11 6-39
IPB002494C 14.46 9.82e-11 10-53


1090 


IPB001762 


Disintegrin 


IPB001762A 23.93 9.65e-09 34-74 
IPB002494A 12.44 9.90e-09 27-60 
IPB000006 13.41 1.00e-08 25-70 


1091 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494C 14.46 8.36e-35 20-63 
IPB002494C 14.46 7.86e-32 124-167 
IPB002494C 14.46 6.55e-31 64-107 
IPB002494C 14.46 8.95e-31 89-132 
IPB002494C 14.46 1.44e-30 134-177 
IPB002494C 14.46 4.23e-28 99-142 
IPB002494C 14.46 9.46e-26 


1091 


IPB000359 


Cystine-knot domain 


IPB000359B 19.26 9.57e-13 24-42 
IPB000359B 19.26 9.57e-13 68-86
IPB002494A 12.44 1.56e-12 42-75
IPB002494B 10.58 2.50e-12 51-65
IPB002494B 10.58 2.50e-12 95-109
IPB002494B 10.58 2.50e-12 130-144
IPB002494C 14.46 5.41e-12 34-77 
IPB002494C 14.46 6.06e-12 128-171 
IPB002494C 14.46 7.28e-12 118-161 


1091 


IPB001271 


Mammalian defensin 


IPB001271 19.97 7.95e-12 58-86 
IPB002494C 14.46 9.25e-12 103-146 
IPB002494B 10.58 9.55e-12 155-169 
IPB001271 19.97 9.59e-12 19-47 
IPB002494B 10.58 1.28e-11 26-40
IPB002494B 10.58 1.28e-11 70-84
IPB002494A 12.44 1.86e-11 156-189
IPB001271 19.97 3.06e-11 138-166
IPB002494A 12.44 4.00e-11 56-89


1091 


IPB000006 


"Vertebrate metallothionein, family 
1" 


IPB000006 13.41 4.10e-11 66-111
IPB002494C 14.46 4.91e-11 113-156
IPB001271 19.97 5.13e-11 97-125
IPB002494C 14.46 6.64e-11 133-176
IPB000006 13.41 6.80e-11 40-85
IPB000359B 19.26 7.48e-11 103-121
IPB002494C 14.46 7.91e-11 98


1091 


IPB001762 


Disintegrin 


IPB001762A 23.93 9.04e-10 129-169 
IPB002494C 14.46 9.21e-10 65-108 
IPB000006 13.41 9.42e-10 95-140
IPB002494C 14.46 9.48e-10 4-47 
IPB000359B 19.26 9.69e-10 158-176 
IPB000359B 19.26 1.28e-09 153-171 
IPB000006 13.41 1.55e-09 115-160 


1091 


IPB000967 


Zinc finger NF-X1 type 


IPB000967E 21.88 1.56e-09 51-91 
IPB002494A 12.44 1.58e-09 147-180 
IPB001762A 23.93 1.88e-09 39-79
IPB001271 19.97 2.15e-09 98-126
IPB002494A 12.44 2.55e-09 62-95
IPB002494A 12.44 3.13e-09 41-74
IPB002494A 12.44 3.23e-09 28-61 
IPB002494A 12.44 3.23e-09 72-105 
IPB002494A 12.44 3.23e-09 77-110 
IPB002494B 10.58 3.41e-09 16-30 
IPB001271 19.97 3.78e-09 23-51 
IPB001271 19.97 3.78e-09 67-95 


1091 


IPB001169 


"Integrin beta, C-terminus" 


IPB001169K 27.45 3.92e-09 121-163
IPB000006 13.41 3.94e-09 80-125 
IPB000006 13.41 4.03e-09 140-185 
IPB001762A 23.93 4.18e-09 44-84 
IPB002494B 10.58 4.42e-09 125-139 
IPB002494A 12.44 4.48e-09 33-66 
IPB000006 13.41 4.86e-09 65- 


1092 


IPB000734 


Lipase 


IPB000734 10.25 8.12e-09 164-178 


1093 


IPB000734 


Lipase 


IPB000734 10.25 8.12e-09 224-238 


1094 


PR01223 


Bride of sevenless protein signature 
VI 


PR01223F 4.19 9.78e-11 203-227


1094 


PR00354 


7Fe ferredoxin signature III 


PR00354C 6.24 8.06e-09 258-275 


1096 


IPB001304


C-type lectin domain 


IPB001304A 17.98 8.04e-14 87-111 


1096 


PR00356 


Type II antifreeze protein signature 
VII 


PR00356G 10.21 1.42e-10 193-206 


1097 


IPB001304 


C-type lectin domain 


IPB001304A 17.98 8.04e-14 87-111


1097


PR00356 


Type II antifreeze protein signature 
VII 


PR00356G 10.21 8.15e-09 193-206 


1098 


PR00245 


Olfactory receptor signature V 


PR00245E 8.96 5.15e-16 283-294 
PR00245B 13.73 3.77e-15 129-141 
PR00245C 14.65 2.73e-14 176-192 
PR00245D 9.34 2.59e-13 236-245 


1098 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 7.00e-12 118-129
PR00245A 10.98 1.72e-11 92-103
IPB000276D 9.40 6.09e-10 282-298 


1098 


PR00534 


Melanocortin receptor family 
signature I 


PR00534A 12.77 2.83e-09 51-63 


1098 


PR00237 


Rhodopsin-like GPCR superfamily 
signature III 


PR00237C 14.77 3.86e-09 104-126 
PR00237B 12.45 6.92e-09 59-80 
PR00237A 9.81 8.31e-09 26-50


1099 


IPB002889 


WSC domain 


IPB002889B 11.76 3.44e-09 56-102 


1099 


IPB000561 


EGF-like domain 


IPB000561 4.89 4.86e-09 306-314 


1099 


IPB000034 


Laminin B 


IPB000034C 12.97 7.43e-09 306-324 


1099 


PR00346 


Tissue factor signature VIII 


PR00346H 10.74 8.18e-09 542-565 


1101 


PR00457 


Animal haem peroxidase signature V 


PR00457E 19.97 8.45e-24 997-1023 
PR00457D 18.35 1.53e-20 972-992 
PR00457C 18.81 9.42e-15 954-972
PR00457G 14.17 4.48e-14 1177-1197 
PR00457H 14.82 5.85e-13 1248-1262 
PR00457F 14.42 6.32e-12 1050-1060 


1101 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 1.00e-10 180-194 
PR00457B 12.43 2.29e-10 802-817


1101 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 2.80e-10 376-413 
IPB003006B 20.23 8.92e-10 466-503 
IPB003006B 20.23 9.28e-10 283-320 


1101 


PR00019 


Leucine-rich repeat signature II 


PR00019B 11.42 6.73e-09 73-86


1102 


PR00457 


Animal haem peroxidase signature V 


PR00457E 19.97 8.45e-24 973-999
PR00457D 18.35 1.53e-20 948-968
PR00457C 18.81 9.42e-15 930-948 
PR00457G 14.17 4.48e-14 1153-1173 
PR00457H 14.82 5.85e-13 1224-1238 
PR00457F 14.42 6.32e-12 1026-1036 


1102 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 1.00e-10 156-170 
PR00457B 12.43 2.29e-10 778-793


1102 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 2.80e-10 352-389 
IPB003006B 20.23 8.92e-10 442-479 
IPB003006B 20.23 9.28e-10 259-296 


1103 


IPB002034 


Alpha-isopropylmalate and 
homocitrate synthase 


IPB002034D 19.67 7.61e-09 786-814 


1107


IPB001359


Synapsin 


IPB001359H 22.58 1.80e-14 741-791 


1107


IPB000885


Fibrillar collagen C-terminal domain


IPB000885A 11.46 8.16e-09 765-802


1107 


IPB001442


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 9.14e-09 746-798 


1110


IPB003006


Immunoglobulin and major
histocompatibility complex domain


lrouujuvoo zu.zj j,jZC"iv ,31-00 


1112 


IPB001841 


RING finger 


IPB001841 10.69 1.95e-09 153-162 


1113 


IPB000961 


Protein kinase C-terminal domain 


IPB000961A 16.82 2.64e-12 193-227 


1113 


IPB000959 


POLO box duplicated region 


IPB000959B 15.68 9.22e-12 288-328 


1113 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245A 22.45 1.87e-11 304-344


1113 


IPB001772 


Kinase associated domain 1 


IPB001772C 20.66 6.11e-11 299-329


1113 


IPB003527 


MAP kinase 


IPB003527C 14.70 3.43e-09 296-344 


1119 


PR01137


Gap junction alpha-8 protein (Cx50) 
signature II 


PR01137B 18.37 8.83e-09 368-380


1120


IPB000906 


ZU5 domain 


IPB000906G 25.85 2.58e-13 921-969 
IPB000906F 35.93 9.00e-12 931-984 
IPB000906D 23.89 1.57e-11 940-994


1120 


PR00452 


SH3 domain signature II 


PR00452B 11.47 2.73e-11 1036-1051


1120 


PR01415 


Ankyrin repeat signature I 


PR01415A 12.73 6.46e-11 954-966
IPB000906A 22.49 7.53e-10 914-956 
PR01415A 12.73 7.97e-10 921-933 


1120


PR00499


Neutrophil cytosol factor 2 signature 
IV 


PR00499D 11.47 4.21e-09 1024-1044


1120


IPB002360


Involucrin 


IPB002360C 15.36 4.90e-09 125-166 
IPB000906F 35.93 7.41e-09 898-951 


1120


IPB000237


GRIP domain


IPB000237B 30.66 8.14e-09 142-192 


1124 


IPB000906 


ZU5 domain 


IPB000906D 23.89 7.66e-10 117-171
IPB000906A 22.49 3.72e-09 58-100 
IPB000906G 25.85 6.69e-09 164-212 


1125


IPB000906


ZU5 domain


IPB000906D 23.89 7.66e-10 117-171
IPB000906A 22.49 3.72e-09 58-100


1129


IPB000421


Coagulation factor 5/8 type C
domain (FA58C)


IPB000421C 36.74 1.93e-16 131-175
IPB000421B 20.70 1.36e-14 79-99 


1130


IPB000421


Coagulation factor 5/8 type C
domain (FA58C)


IPB000421C 36.74 1.93e-16 131-175
IPB000421B 20.70 1.36e-14 79-99


1130 


PR01435


NADH-plastoquinone
oxidoreductase chain 5 signature II


PR01435B 5.98 7.37e-10 1059-1083


1131 


IPB002119 


Histone H2A 


IPB002119A 4.97 1.00e-08 92-98


1133


IPB001245


Tyrosine kinase catalytic domain 


IPB001245B 21.68 4.43e-18 178-216


1133 


IPB003527 


MAP kinase 


IPB003527D 21.53 3.41e-16 171-212 


1133 


IPB000961 


Protein kinase C-terminal domain


IPB000961A 16.82 6.56e-15 10-44 


1133 


IPB000861 


PKN/rhophilin/rhotekin rho-binding 
repeat 


IPB000861D 13.61 6.92e-15 8-44 


1133


IPB000959


POLO box duplicated region 


IPB000959C 23.49 6.34e-14 153-205 
IPB003527G 17.26 4.28e-13 320-357 
IPB001245A 22.45 8.07e-13 119-159


1133 


IPB001772 


Kinase associated domain 1 


IPB001772C 20.66 4.51e-12 114-144
IPB000861G 13.73 5.06e-12 180-229 


1133 


IPB000095 


PAK-box /P21-Rho-binding


IPB000095F 16.47 1.18e-11 182-236
IPB000961D 21.23 1.00e-10 174-215
IPB001772A 13.64 1.86e-10 8-39
IPB003527A 17.00 2.75e-10 17-42 
IPB000959B 15.68 9.10e-10 103-143 


1135


PR00402


Tec/Btk domain signature I


PR00402A 20.14 8.15e-15 664-683 
PR00402B 12.26 4.69e-13 683-695 


1135 


PR00360 


C2 domain signature II 


PR00360B 11.64 9.25e-13 174-187 
PR00402C 13.13 8.03e-12 695-708 


1135


IPB000008


C2 domain


IPB000008D 14.83 1.61e-ll 200-218 
PR00360A 15.18 6.00e-10 150-162 
PR00360A 15.18 8.33e-10 22-34 


1135 


PR00399 


Synaptotagmin signature IV 


PR00399D 12.72 4.89e-09 79-89 
PR00360C 7 35 5 50e-0Q 1 0^>-?04 


1137 


IPB003886 


Extracellular domain in nidogen 


IPB003886D 13.91 8.57e-15 261-280 


1137 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site


IPB000152 8.86 7.16e-14 134-149 
IPB000152 8.86 9.05e-14 216-231 
IPB000152 8.86 5.91e-13 261-276 


1137 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 9.25e-13 216-227 


1137 


IPB001774 


Delta serrate ligand 


IPB001774C 18.25 9.69e-12 66-108 
IPB001881B 12.28 1.95e-11 134-145


1137 


IPB000033


"Low-density lipoprotein (ldl)
receptor, YWTD repeat"


IPB000033B 7.05 4.96e-11 266-276


1137


PR01217 


Proline rich extensin signature VII 


PR01217G 4.02 5.15e-11 340-365


1137 


PR00907 


Thrombomodulin signature II 


PR00907B 11.50 6.70e-11 168-184
IPB001881B 12.28 1.00e-10 261-272


1137 


IPB000925 


Pneumovirus attachment 
glycoprotein G 


IPB000925F 15.07 3.60e-10 336-372 


1137 


IPB000561 


EGF-like domain 


IPB000561 4.89 6.25e-10 75-83 


1137 


PR00010 


Type II EGF-like signature III 


PR00010C 6.98 1.66e-09 266-276 


1137 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 3.29e-09 348-362 
PR00049D 0.00 3.29e-09 350-364 
IPB000033B 7.05 3.84e-09 221-231
PR01217E 3.04 4.48e-09 348-364
PR01217B 4.82 6.55e-09 347-363
IPB000561 4.89 6.79e-09 270-278
PR00010C 6.98 7.15e-09 139-149
PR01217D 4.57 7.16e-09 343-364
PR00010C 6.98 7.80e-09 221-231
IPB000033B 7.05 8.11e-09 139-149


1137 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367A 11.78 8.62e-09 183-203 


1137 


PR00910 


Luteovirus ORF6 protein signature I 


PR00910A 2.74 8.71e-09 348-360 
PR00910A 2.74 9.46e-09 346-358 
PR01217G 4.02 9.92e-09 343-368 


1138 


IPB001156 


Transferrin 


IPB001156H 23.81 7.75e-09 118-172 


1143 


PR00245 


Olfactory receptor signature III 


PR00245C 14.65 9.53e-17 59-75 


1143 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 9.25e-14 1-12 
PR00245D 9.34 1.53e-13 119-128 
PR00245E 8.96 6.81e-12 166-177 
PR00245B 13.73 1.00e-10 12-24 
IPB000276D 9.40 3.08e-09 165-181 


1143 


PR00237 


Rhodopsin-like GPCR superfamily 
signature V 


PR00237E 13.03 3.83e-09 82-105 
PR00237G 19.23 1.00e-08 155-181 


1144 


PR00245 


Olfactory receptor signature III 


PR00245C 14.65 9.53e-17 173-189


1144 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 9.25e-14 117-128
PR00245D 9.34 1.53e-13 233-242
PR00245E 8.96 6.81e-12 280-291
PR00245A 10.98 7.14e-12 91-102
PR00245B 13.73 8.14e-10 128-140 


1144 


PR00237 


Rhodopsin-like GPCR superfamily 
signature III 


PR00237C 14.77 2.02e-09 103-125 
IPB000276D 9.40 3.08e-09 279-295 
PR00237E 13.03 3.83e-09 196-219 


1144 


PR00534 


Melanocortin receptor family 
signature I 


PR00534A 12.77 5.17e-09 50-62


1144 


PR00896 


Vasopressin receptor signature II 


PR00896B 9.36 7.23e-09 54-65 
PR00237G 19.23 1.00e-08 269-295 


1146 


IPB000017 


Syntaxin / epimorphin family 


IPB000017 23.80 1.84e-09 168-217 


1147 


PR01360 


Interleukin-1 receptor antagonist 
precursor IL-1RA signature VI 


PR01360F 14.44 3.11e-12 117-135 
PR01360C 10.33 4.84e-11 58-75


1147 


IPB000975 


Interleukin-1 


IPB000975D 24.45 5.55e-09 52-91
IPB000975E 28.12 9.80e-09 96-135 


1147 


PR00264 


Interleukin-1 precursor family 
signature I 


PR00264A 18.63 1.00e-08 55-75 


1148 


PR01248 


Type I keratin signature V 


PR01248E 12.72 3.67e-21 248-274 


1148 


IPB001664 


Intermediate filament proteins 


IPB001664B 17.44 9.16e-20 104-143
IPB001664A 11.94 8.13e-19 381-406 
PR01248C 10.07 8.34e-17 150-170 


1148 


IPB001322 


Intermediate filament tail domain 


IPB001322A 30.52 2.23e-14 370-423 
IPB001664C 11.32 3.25e-13 161-188 
PR01248B 8.42 3.29e-13 96-119
PR01248D 9.34 3.60e-12 222-237
PR01248A 8.12 6.14e-11 75-88


1148 


PR01177 


Metabotropic gamma-aminobutyric 
acid type B1 receptor signature X


PR01177J 6.10 4.96e-10 397-415
PR01177J 6.10 3.63e-09 13-31
IPB001664D 12.63 5.36e-09 279-305 


1151 


IPB001664 


Intermediate filament proteins 


IPB001664D 12.63 4.75e-28 384-410 


1151 


PR01276 


Type II keratin signature IV 


PR01276D 13.08 8.31e-24 222-241
IPB001664A 11.94 9.50e-23 132-157 


1151 


IPB001322 


Intermediate filament tail domain 


IPB001322C 22.70 4.75e-22 374-419 
IPB001664C 11.32 8.20e-21 266-293 
PR01276E 12.04 4.75e-15 301-318 
IPB001322A 30.52 4.08e-14 121-174 
PR01276F 10.92 3.21e-11 352-367
PR01276C 10.16 8.66e-11 208-221
IPB001664B 17.44 5.27e-10 191-230 
PR01276B 9.79 5.96e-10 161-173 
PR01276A 10.31 7.16e-10 134-142 


1151 


IPB003743 


DUF164 


IPB003743B 20.16 9.21e-10 300-338 


1152 


IPB001818 


Matrixin 


IPB001818C 24.38 8.03e-32 157-202 
IPB001818B 26.48 6.04e-31 112-153
IPB001818A 14.60 2.13e-29 66-95 
IPB001818H 15.46 3.25e-23 332-358 
IPB001818F 11.19 4.91e-20 231-251 


1152 


PR00138 


Matrixin signature I 


PR00138A 12.54 1.64e-16 86-99 
PR00138C 20.07 1.78e-16 155-183 
IPB001818G 14.71 1.96e-12 268-280
PR00138B 14.84 5.21e-10 131-146


1153 


IPB001818


Matrixin 


IPB001818C 24.38 8.03e-32 157-202 
IPB001818B 26.48 6.04e-31 112-153 
IPB001818A 14.60 2.13e-29 66-95 
IPB001818H 15.46 3.25e-23 332-358 
IPB001818F 11.19 4.91e-20 231-251 


1153 


PR00138 


Matrixin signature I 


PR00138A 12.54 1.64e-16 86-99 
PR00138C 20.07 1.78e-16 155-183 
IPB001818G 14.71 1.96e-12 268-280 
PR00138B 14.84 5.21e-10 131-146 


1154 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 2.07e-09 10-24 


1154 


IPB002000 


Lysosome-associated membrane 
glycoprotein (Lamp) 


IPB002000D 5.87 5.25e-09 12-25 


1155 


IPB001124 


Lipid-binding serum glycoprotein 


IPB001124C 25.71 7.71e-17 210-253 
IPB001124D 21.85 5.71e-14 274-310


1156 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135B 13.24 9.39e-10 84-128 
IPB000135A 11.69 6.19e-09 111-165 


1156 


IPB003533 


Doublecortin 


IPB003533H 6.52 7.51e-09 49-72 


1159 


IPB001510 


Poly(ADP-ribose) polymerase zinc 
finger domain 


IPB001510D 30.92 1.00e-40 490-543 
IPB001510E 22.53 1.00e-40 570-624 
IPB001510A 34.80 7.21e-40 92-137 
IPB001510B 23.09 6.14e-34 306-348 
IPB001510C 15.91 6.54e-27 363-396 


1159 


IPB000977 


ATP-dependent DNA ligase 


IPB000977B 14.05 4.60e-13 508-517 
IPB000977C 7.51 1.00e-12 590-599
IPB000977A 8.89 1.47e-09 480-487 


1160 


IPB000215 


Serpins 


IPB000215E 15.36 5.50e-23 401-425 
IPB000215D 15.35 6.82e-21 317-343 
IPB000215A 13.01 7.43e-18 27-50 
IPB000215C 13.90 3.16e-12 207-221
IPB000215B 9.87 9.59e-11 178-190


1166


IPB001309 


ICE-like protease (caspase) p20 
domain 


IPB001309A 10.71 3.57e-14 7-17 


1166 


PR00376


Interleukin-1B converting enzyme
signature I 


PR00376A 12.81 1.61e-10 5-18 


1168 


IPB000364


Phosphoenolpyruvate carboxykinase 
(GTP) 


IPB000364M 26.08 1.40e-09 589-623 


1169 


IPB001304 


C-type lectin domain 


IPB001304A 17.98 6.50e-17 118-142 


1171 


PR00320 


G protein beta WD-40 repeat 
signature II


PR00320B 12.82 6.62e-13 478-492 


1171 


PR00308 


Type I antifreeze protein signature I 


PR00308A 3.72 8.17e-13 158-172 
PR00320A 13.15 2.89e-12 478-492 
PR00320C 12.32 4.18e-12 247-261 
PR00320C 12.32 4.71e-12 478-492 
PR00320B 12.82 7.75e-12 247-261
PR00320A 13.15 8.11e-12 427-441
PR00320A 13.15 9.05e-12 247-261
PR00308B 3.38 9.27e-12 161-172
PR00308A 3.72 9.76e-12 162-176
PR00308C 2.79 1.00e-11 161-170


1171 


PR01511 


Kv1.4 voltage-gated K+ channel
signature IV 


PR01511D 3.91 3.02e-11 163-173
PR00320C 12.32 3.57e-11 427-441
PR00320B 12.82 5.09e-11 520-534
PR00320B 12.82 7.14e-11 427-441
PR00320A 13.15 7.55e-11 520-534
PR00320C 12.32 4.52e-10 520-534 


1171 


PR00833 


Pollen allergen Poa pI signature VIII


PR00833H 2.61 8.56e-10 164-178 
PR00308C 2.79 8.77e-10 165-174 
PR01511D 3.91 9.88e-10 159-169 


1171 


IPB001680


G-protein beta WD-40 repeats 


IPB001680 10.43 1.45e-09 429-440 
PR00308B 3.38 1.76e-09 165-176
IPB001680 10.43 3.70e-09 480-491
IPB001680 10.43 4.15e-09 249-260 


1171 


PR00456 


Ribosomal protein P2 signature V 


PR00456E 3.08 5.08e-09 163-177 
PR00308A 3.72 6.74e-09 159-173 
PR00320A 13.15 7.75e-09 303-317 
PR00833H 2.61 7.78e-09 161-175 
PR00320B 12.82 8.45e-09 344-358 


1171 


IPB000102 


Neuraxin / MAP1B repeat 


IPB000102A 10.50 8.88e-09 156-184 
IPB001680 10.43 9.10e-09 522-533 
IPB000102A 10.50 9.22e-09 160-188 
PR00308B 3.38 9.75e-09 162-173 


1175 


IPB001559 


Phosphotriesterase family 


IPB001559D 19.17 5.00e-20 176-202 
IPB001559C 16.25 5.34e-16 141-162 
IPB001559E 16.18 5.35e-16 214-232
IPB001559A 10.81 1.23e-11 18-29
IPB001559B 12.98 8.50e-10 122-132


1183 


IPB003817 


Phosphatidylserine decarboxylase 


IPB003817D 23.34 8.71e-25 338-364 
IPB003817C 10.66 4.00e-15 316-328
IPB003817E 13.21 2.67e-14 427-443
IPB003817A 12.64 4.15e-13 162-176 


1184 


IPB000580 


TSC-22 / Dip / Bun family 


IPB000580 14.33 1.00e-40 116-170 


1185 


PR00072 


Malic enzyme signature IV 


PR00072D 12.09 9.29e-09 571-589 


1187 


PR00901 


Pheromone B alpha-1 receptor
signature VIII 


PR00901H 14.75 4.05e-09 56-66 


1188 


IPB002469 


"Dipeptidyl peptidase IV, N- 
terminus" 


IPB002469I 10.99 4.86e-16 747-765 
IPB002469H 21.17 6.14e-16 702-737 
IPB002469J 8.97 3.52e-12 829-845


1188


IPB002471 


Prolyl endopeptidase family serine 
active site 


IPB002471B 24.90 3.66e-11 734-765
IPB002469G 26.76 9.24e-11 657-695


1189 


IPB002469 


"Dipeptidyl peptidase IV, N- 
terminus" 


IPB002469I 10.99 4.86e-16 747-765
IPB002469H 21.17 6.14e-16 702-737 
IPB002469J 8.97 3.52e-12 791-807 


1189 


IPB002471 


Prolyl endopeptidase family serine 
active site 


IPB002471B 24.90 3.66e-11 734-765
IPB002469G 26.76 9.24e-11 657-695


1190 


IPB002469 


"Dipeptidyl peptidase IV, N- 
terminus" 


IPB002469I 10.99 4.86e-16 734-752 
IPB002469H 21.17 6.14e-16 689-724 
IPB002469J 8.97 3.52e-12 816-832


1190 


IPB002471 


Prolyl endopeptidase family serine 
active site 


IPB002471B 24.90 3.66e-11 721-752
IPB002469G 26.76 9.24e-11 644-682


1191 


IPB000524 


"Bacterial regulatory proteins, GntR 
family" 


IPB000524 18.80 7.19e-10 54-94 


1193 


IPB000906 


ZU5 domain 


IPB000906A 22.49 6.14e-19 241-283 
IPB000906F 35.93 3.09e-16 159-212 
IPB000906F 35.93 7.91e-16 192-245 


1193 


PR01415 


Ankyrin repeat signature I


PR01415A 12.73 3.70e-15 348-360 
IPB000906A 22.49 1.71e-14 142-184 
PR01415A 12.73 9.10e-13 799-811 
IPB000906F 35.93 1.00e-12 442-495 
IPB000906A 22.49 5.66e-12 208-250 
IPB000906G 25.85 9.36e-12 149-197 
PR01415A 12.73 1.00e-11 1


1194 


PR00834 


HtrA/DegQ protease family signature 

III


PR00834C 15.48 7.35e-19 253-277
PR00834D 11.75 7.39e-17 291-308
PR00834B 10.17 3.25e-13 212-232 
PR00834E 13.43 6.03e-12 313-330 


1194 


IPB000126 


"Serine proteases, V8 family" 


IPB000126B 12.50 6.81e-12 296-312 
PR00834A 8.79 1.44e-11 191-203
PR00834F 11.11 1.53e-09 374-386
IPB000126A 11.75 9.83e-09 183-198 


1195 


PR00424 


Adenosine receptor signature IV 


PR00424D 13.35 4.34e-22 21-40 


1195 


PR00555 


Adenosine A3 receptor signature V 


PR00555E 7.35 4.75e-21 105-122 
PR00555F 11.48 2.74e-20 152-169
PR00555D 10.79 9.36e-19 60-76 
PR00424E 14.23 3.75e-14 74-87 


1195 


PR00237 


Rhodopsin-like GPCR superfamily 
signature VII 


PR00237G 19.23 4.21e-14 119-145 
PR00237F 14.34 9.28e-14 83-107 
PR00237E 13.03 4.60e-12 33-56 


1195 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276D 9.40 7.30e-12 129-145 
PR00424F 8.75 9.07e-12 119-129 


1197 


PR00245 


Olfactory receptor signature IV 


PR00245D 9.34 1.53e-13 241-250 
PR00245C 14.65 1.56e-12 181-197 


1197 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 5.20e-12 123-134 


1197 


PR00237 


Rhodopsin-like GPCR superfamily 
signature III 


PR00237C 14.77 6.73e-11 109-131
PR00245E 8.96 3.30e-10 288-299 
PR00237E 13.03 4.77e-10 204-227 
PR00245A 10.98 3.65e-09 97-108 
PR00245B 13.73 4.60e-09 134-146 


1197 


PR00534 


Melanocortin receptor family 
signature I 


PR00534A 12.77 8.43e-09 56-68 


1198 


PR00505 


D12 class N6 adenine-specific DNA 
methyltransferase signature I 


PR00505A 15.44 3.67e-12 30-46 
PR00505B 11.79 8.88e-12 51-65


1199 


PR01254 


Prostaglandin D synthase signature I 


PR01254A 12.32 6.38e-10 25-48 


1199 


PR00179 


Lipocalin signature II 


PR00179B 7.67 2.35e-09 111-123 
PR00179A 13.97 5.80e-09 31-43
PR00179C 17.26 6.70e-09 138-153


1199 


PR01174 


Retinol binding protein signature VI 


PR01174F 11.76 6.82e-09 110-126 
PR01254E 14.07 8.23e-09 135-149 


1199 


PR01275 


Neutrophil gelatinase lipocalin 
signature II 


PR01275B 9.02 1.00e-08 33-43 


1200 


PR01042 


Aspartyl-tRNA synthetase signature 
IV 


PR01042D 11.70 2.67e-14 432-446
PR01042B 12.76 4.69e-11 233-246
PR01042C 16.81 5.50e-11 393-409
PR01042A 9.01 9.77e-10 217-229 


1200 


IPB002106 


Aminoacyl-transfer RNA synthetases 
class-II


IPB002106A 13.35 1.00e-08 169-181 


1201 


PR01217 


Proline rich extensin signature VII 


PR01217G 4.02 8.03e-09 528-553


1202 


IPB003952 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 


IPB003952E 9.04 2.46e-16 31-48 


1203 


IPB001895 


Guanine-nucleotide dissociation 
stimulators CDC25 family 


IPB001895C 20.83 8.50e-23 297-332 


1204 


IPB000958


KH domain 


IPB000958 6.84 5.09e-12 112-125
IPB000958 6.84 2.29e-11 28-41
IPB000958 6.84 7.88e-10 276-289 


1207 


IPB001393 


Calsequestrin 


IPB001393A 16.72 1.00e-40 29-78 
IPB001393B 11.93 1.00e-40 132-185 
IPB001393C 16.33 1.00e-40 188-240 
IPB001393D 11.26 1.00e-40 283-335 


1207 


PR00312 


Calsequestrin signature V 


PR00312E 8.61 7.75e-36 163-192 
PR00312I 15.97 5.71e-35 326-354 
PR00312F 16.12 7.87e-35 193-222 
PR00312H 13.19 2.80e-34 257-284 
PR00312J 13.61 6.48e-34 357-385 
PR00312D 9.10 7.17e-33 122-151
PR00312B 14.57 4.41e-32 56-85 
PR00312C 16.48 5.62e-32 86-115 
PR00312G 11.43 1.49e-31 224-251 
PR00312A 11.96 7.94e-27 29-52


1209 


IPB002151 


Kinesin light chain repeat 


IPB002151A 11.63 5.55e-10 275-305


1209 


PR00985 


Leucyl-tRNA synthetase signature I 


PR00985A 10.14 8.25e-09 515-532


1210 


IPB000353


"Class II histocompatibility antigen, 
beta chain, beta-1 domain" 


IPB000353B 19.16 7.89e-16 137-186 


1210 


IPB003006


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006A 17.51 7.63e-15 158-180 


1210 


IPB001003 


"MHC Class II, alpha chain, alpha-1
domain" 


IPB001003B 14.72 3.87e-10 145-188 


1213 


PR00205 


Cadherin signature II 


PR00205B 20.09 8.31e-23 244-273 


1213 


IPB002126 


Cadherin domain 


IPB002126B 12.04 5.80e-16 232-249 
PR00205D 12.22 7.26e-15 436-455 
PR00205F 19.57 1.64e-14 515-541 
PR00205G 13.05 4.86e-14 549-566 
PR00205A 17.38 7.88e-14 75-94 
PR00205D 12.22 3.40e-13 331-350 
PR00205D 12.22 5.80e-13 223-242


1214 


IPB001580 


Calreticulin family 


IPB001580D 12.66 2.71e-38 259-294 
IPB001580B 18.74 1.90e-35 166-201 


1214 


PR00626 


Calreticulin signature IV 


PR00626D 7.86 9.00e-30 242-264 
IPB001580A 12.93 8.71e-28 91-113 
PR00626E 10.35 4.68e-23 280-299 
PR00626B 14.56 6.06e-20 126-142 
PR00626E 10.35 8.00e-19 266-285 
PR00626A 14.93 6.50e-18 100-118
PR00626C 9.33 8.71e-18 215-228
IPB001580C 9.76 1.56e-17 242-254 
IPB001580D 12.66 2.38e-16 245-280 
IPB001580D 12.66 8.34e-16 273-308 
IPB001580C 9.76 4.30e-15 208-220 
IPB001580C 9.76 4.16e-14 225-237 
PR00626C 9.33 7.75e-12 232-245 
PR00626D 7.86 9.14e-09 208-230 


1215 


IPB000006 


"Vertebrate metallothionein, family 

1"


IPB000006 13.41 3.90e-12 32-77 
IPB000006 13.41 4.41e-12 39-84 
IPB000006 13.41 6.70e-11 35-80


1215 


PR01228 


Eggshell protein signature III 


PR01228C 5.69 1.22e-10 26-41 
PR01228C 5.69 1.98e-10 10-25


1215 


IPB001271


Mammalian defensin 


IPB001271 19.97 3.29e-10 51-79 


1215 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494C 14.46 3.36e-10 45-88 
IPB001271 19.97 3.47e-10 29-57 
IPB002494A 12.44 6.11e-10 70-103 


1215 


IPB002174 


Furin-like cysteine rich region 


IPB002174A 30.51 7.32e-10 11-42 
IPB002174A 30.51 7.81e-10 3-34 
PR01228C 5.69 8.05e-10 19-34 


1215 


IPB003571


Snake toxin 


IPB003571B 18.08 8.07e-10 76-99
IPB002494A 12.44 9.08e-10 25-58 


1215 


PR00858 


Crustacean metallothionein signature 
II 


PR00858B 5.93 1.48e-09 40-58 
IPB000006 13.41 3.11e-09 36-81 


1215 


IPB001169 


"Integrin beta, C-terminus" 


IPB001169K 27.45 3.19e-09 42-84


1215 


IPB002919 


Trypsin Inhibitor-like cysteine rich 
domain 


IPB002919A 15.56 3.57e-09 52-64 
IPB002174A 30.51 4.15e-09 27-58 
IPB001271 19.97 4.44e-09 58-86 
IPB002494A 12.44 4.97e-09 32-65 
PR01228C 5.69 5.03e-09 18-33 
PR01228C 5.69 5.03e-09 22-37 
IPB002174A 30.51 5.28e-09 19-50 


1215 


IPB000254 


"Cellulose-binding domain, fungal 
type" 


IPB000254 18.11 5.36e-09 28-58 
IPB000006 13.41 5.59e-09 42-87 
IPB002174A 30.51 5.72e-09 36-67 
PR01228C 5.69 5.76e-09 27-42 


1215 


IPB000564 


2Fe-2S Ferredoxin 


IPB000564A 17.31 6.49e-09 1-19 


1215 


IPB000867 


Insulin-like growth factor-binding 
protein 


IPB000867B 11.44 6.55e-09 5-21
IPB002174A 30.51 6.62e-09 7-38 


1215 


IPB002867 


Cysteine-rich domain (C6HC)


IPB002867D 24.88 7.19e-09 38-69 
IPB000006 13.41 7.24e-09 50-95 


1215 


IPB000967 


Zinc finger NF-X1 type


IPB000967D 10.42 7.37e-09 60-95 
IPB001169K 27.45 7.81e-09 35-77 
IPB000006 13.41 8.07e-09 3-48 
IPB000006 13.41 8.07e-09 40-85 
IPB002494A 12.44 8.35e-09 29-62 
IPB000006 13.41 8.44e-09 55-100 


1215 


PR01117 


CLC-6 chloride channel signature I 


PR01117A 7.79 9.47e-09 51-63
IPB001271 19.97 9.51e-09 67-95
IPB002174A 30.51 9.77e-09 39-70 


1215 


IPB002221 


WAP-type (Whey Acidic Protein) 
four-disulfide core domain 


IPB002221B 17.12 1.00e-08 48-69 


1218 


PR00946 


Mercury scavenger protein signature 
I 


PR00946A 4.14 8.16e-09 6-24 


1221 


IPB002038 


Osteopontin 


IPB002038C 22.35 1.00e-40 119-160 


1221 


PR00216 


Osteopontin signature I 


PR00216A 11.45 9.71e-34 2-31 
IPB002038B 15.58 2.06e-32 23-67
PR00216C 9.12 5.85e-32 41-66
IPB002038A 12.23 5.15e-31 1-30
PR00216G 12.73 8.50e-30 231-256 
PR00216F 12.92 1.62e-22 152-170 
PR00216D 3.16 3.30e-18 88-102 
PR00216E 6.95 3.81e-18 120-134
IPB002038D 9.52 5.50e-17 248-263
PR00216D 3.16 3.69e-12 82-96


1221 


IPB003403 


Herpesvirus immediate early protein 


IPB003403E 17.25 9.26e-09 63-90 


1222 


IPB000215 


Serpins 


IPB000215A 13.01 9.14e-18 107-130
IPB000215D 15.35 3.74e-17 332-358 
IPB000215E 15.36 6.68e-16 419-443 
IPB000215C 13.90 7.88e-15 229-243 


1223 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 3.52e-10 279-316 
IPB003006A 17.51 7.75e-09 141-163 


1225 


IPB001241 


DNA topoisomerase II family 


IPB001241F 23.94 8.36e-37 399-447 


1225 


PR01158 


Topoisomerase II signature VIII 


PR01158H 13.39 5.50e-30 728-750 
IPB001241G 14.13 1.00e-29 471-497 
PR01158K 14.14 5.24e-27 947-973
PR01158G 9.37 5.91e-27 681-704


1225 


IPB002205 


"DNA gyrase/topoisomerase IV, 
subunit A" 


IPB002205B 14.49 4.79e-24 684-719 
IPB001241E 20.94 3.00e-22 295-321 
PR01158I 13.95 7.00e-22 758-778
PR01158D 11.94 5.24e-21 489-504 


1225 


PR00418 


DNA topoisomerase II family 
signature VI 


PR00418F 13.13 3.40e-20 470-486 
IPB001241B 10.04 2.71e-19 96-114
PR00418G 12.91 8.94e-19 488-505
IPB001241H 17.27 1.96e-18 732-755 


1225 


PR00615 


CCAAT-binding transcription factor 
subunit A signature I 


PR00615A 17.09 2.93e-18 243-261 
PR01158J 13.56 3.45e-18 863-877 
IPB002205D 10.13 3.54e-18 791-812 
PR00615B 18.03 3.77e-18 631-649 
PR00418C 9.38 1.82e-17 100-114 
PR00418I 17.21 4.60e-17 550-566
IPB002205A 8.13 9.54e-17 653-671 
PR00418A 13.58 7.65e-16 20-35
PR01158C 11.35 1.00e-15 443-456 
PR01158E 8.11 2.29e-15 509-520 
PR01158F 10.39 4.71e-15 556-568
PR00615C 17.93 8.50e-15 1072-1090
PR00418E 14.82 1.37e-14 397-411 
IPB001241D 14.87 1.43e-14 252-265 
PR00418B 12.37 2.57e-14 57-70 
PR00418D 14.25 2.71e-14 252-265 
PR01158A 7.61 4.60e-13 380-390 
IPB002205C 11.89 5.09e-12 736-750
PR00418H 10.58 5.91e-12 508-520
IPB001241C 13.37 1.31e-11 154-166


1225 


IPB000509 


Ribosomal protein L36E


PR01158B 8.30 1.27e-10 395-402 


1225 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 5.64e-09 1286-1310
IPB000135D 2.13 7.45e-09 1287-1311 
IPB000135D 2.13 8.09e-09 1288-1312 


1225 


PR01469 


Bacterial carbamate kinase signature 
V 


PR01469E 10.60 8.43e-09 52-70
IPB000135D 2.13 8.73e-09 1284-1308 


1226 


IPB000873 


AMP-dependent synthetase and 
ligase 


IPB000873A 11.08 1.50e-12 248-263


1226 


PR00154 


AMP-binding signature I 


PR00154A 8.79 5.14e-09 241-252 



WO 2004/080148 



PCT/US2003/030720



420 
TABLE 3B 



1227 


IPB001043 


"Vinculin, type 1" 


IPB001043E 22.70 9.08e-09 136-173 


1228 


IPB001073 


Complement C1q protein


IPB001073B 20.88 3.48e-24 96-130 
IPB001073C 13.07 4.50e-13 163-182 
IPB001073A 22.14 6.55e-13 42-76 


1228 


PR00007 


Complement C1Q domain signature 
II 


PR00007B 15.63 9.56e-13 116-135 
IPB001073D 7.60 1.00e-11 195-204
PR00007D 9.66 2.00e-11 193-203
PR00007C 16.13 7.38e-11 163-184
PR00007A 20.64 3.04e-10 89-115 


1230 


IPB000906 


ZU5 domain 


IPB000906A 22.49 1.99e-15 274-316 


1230 


PR01415 


Ankyrin repeat signature I 


PR01415A 12.73 3.70e-15 381-393 
IPB000906G 25.85 6.04e-12 900-948 
IPB000906A 22.49 2.24e-11 893-935
PR01415A 12.73 1.00e-10 281-293 
IPB000906F 35.93 1.61e-10 225-278 
PR01415A 12.73 2.45e-10 796-808
IPB000906D 23.89 3.88e-10 3 


1230 


PR00665 


Oxytocin receptor signature V 


PR00665E 6.24 6.76e-09 756-769 
IPB000906E 22.11 7.22e-09 278-318 
PR01415B 10.23 7.75e-09 260-272
PR01415B 10.23 9.25e-09 227-239


1231 


IPB001124 


Lipid-binding serum glycoprotein 


IPB001124C 25.71 7.71e-17 210-253
IPB001124D 21.85 5.71e-14 274-310 


1232 


IPB001124 


Lipid-binding serum glycoprotein 


IPB001124C 25.71 7.71e-17 210-253
IPB001124D 21.85 5.71e-14 274-310 


1233 


IPB001124 


Lipid-binding serum glycoprotein 


IPB001124C 25.71 7.71e-17 210-253 
IPB001124D 21.85 5.71e-14 274-310


1234 


PR00053 


Fork head domain signature II 


PR00053B 12.24 8.50e-09 523-540


1236 


IPB000258 


Bacterial ice-nucleation proteins 
octamer repeat 


IPB000258G 8.61 7.77e-09 92-145 


1237 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 6.57e-13 253-290 


1240 


IPB001627 


Sema domain 


IPB001627F 22.05 5.09e-29 255-288 
IPB001627G 21.49 2.17e-28 311-344 
IPB001627C 21.13 1.22e-21 162-193 
IPB001627B 18.84 1.79e-21 117-145


1240 


IPB002165 


Plexin repeat 


IPB002165C 18.49 3.45e-19 255-287
IPB001627I 10.67 6.57e-15 386-399
IPB001627A 16.97 5.26e-14 98-113 
IPB001627H 10.22 1.35e-13 358-370 
IPB001627K 13.76 7.92e-13 524-536 
IPB001627J 11.43 1.22e-12 436-452
IPB002165C 18.49 3.64e-12 254-286
IPB002165D 14.72 3.65e-12 524-536 
IPB001627D 16.04 6.70e-12 209-224 
IPB002165B 13.59 7.57e-12 136-145 
IPB001627E 8.70 9.59e-12 230-239 


1247 


PR00011 


Type III EGF-like signature IV 


PR00011D 12.12 8.93e-16 767-785
PR00011D 12.12 1.00e-15 550-568
PR00011B 13.08 5.06e-15 767-785
PR00011B 13.08 6.65e-15 289-307
PR00011D 12.12 6.67e-15 289-307
PR00011A 14.05 2.53e-14 289-307
PR00011D 12.12 5.86e-14 638-656
PR00011B 13.08 8.50e-14 550-568
PR00011B 13.08 1.93e-13 160-178
PR00011B 13.08 2.55e-13 203-221
PR00011B 13.08 2.86e-13 421-439
PR00011D 12.12 3.83e-13 378-396
PR00011D 12.12 6.00e-13 421-439
PR00011A 14.05 7.83e-13 378-396
PR00011A 14.05 9.53e-13 203-221
PR00011B 13.08 9.53e-13 378-396
PR00011D 12.12 1.00e-12 810-828
PR00011B 13.08 1.59e-12 810-828
PR00011A 14.05 2.05e-12 550-568
PR00011D 12.12 3.02e-12 203-221
PR00011B 13.08 4.84e-12 638-656
PR00011D 12.12 5.50e-12 160-178
PR00011D 12.12 7.67e-12 507-525


1247 


IPB000561 


EGF-like domain 


IPB000561 4.89 7.75e-12 210-218 
PR00011D 12.12 8.29e-12 332-350
PR00011A 14.05 8.65e-12 421-439
PR00011A 14.05 1.55e-11 767-785
PR00011D 12.12 1.73e-11 593-611
PR00011A 14.05 3.08e-11 638-656
PR00011B 13.08 5.43e-11 593-611
PR00011D 12.12 6.66e-11 464-482
PR00011B 13.08 7.78e-11 332-350
PR00011D 12.12 7.82e-11 724-742


1247 


IPB000034 


Laminin B


IPB000034C 12.97 8.04e-11 210-228
PR00011A 14.05 8.34e-11 724-742
PR00011A 14.05 8.62e-11 160-178
PR00011B 13.08 9.03e-11 246-264
PR00011A 14.05 1.40e-10 810-828
PR00011B 13.08 1.53e-10 724-742
PR00011A 14.05 1.93e-10 507-525
PR00011D 12.12 2.25e-10 246-264
PR00011B 13.08 2.59e-10 507-525
PR00011A 14.05 4.04e-10 464-482


1247 


IPB001774 


Delta serrate ligand


IPB001774C 18.25 4.35e-10 115-157
IPB000561 4.89 4.75e-10 296-304
PR00011A 14.05 5.63e-10 246-264


1247 


IPB001886 


Laminin N-terminal (Domain VI) 


IPB001886E 10.90 7.17e-10 294-310 
PR00011D 12.12 8.20e-10 681-699
PR00011B 13.08 1.25e-09 464-482
IPB000561 4.89 1.64e-09 731-739
PR00011A 14.05 2.00e-09 332-350
PR00011A 14.05 2.75e-09 681-699


1247 


PR00764 


Complement C9 signature VI


PR00764F 15.74 3.96e-09 237-257 


1247 


IPB002174 


Furin-like cysteine rich region 


IPB002174A 30.51 4.60e-09 785-816 
PR00011A 14.05 4.87e-09 593-611


1247 


IPB002899 


EB module 


IPB002899A 6.67 6.32e-09 415-421 
IPB002899A 6.67 6.32e-09 761-767 


1247 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494A 12.44 6.32e-09 652-685 


1247


IPB003884


Factor I membrane attack complex 


IPB003884F 16.26 7.27e-09 587-602 
IPB000034C 12.97 7.55e-09 296-314 
IPB001886E 10.90 7.83e-09 772-788 
IPB000561 4.89 8.71e-09 645-653 
IPB000561 4.89 8.71e-09 688-696 
PR00011B 13.08 8.77e-09 681-699
IPB000561 4.89 1.00e-08 253-261 


1249 


IPB002867 


Cysteine-rich domain (C6HC) 


IPB002867D 24.88 5.04e-18 129-160 


1249 


PR01475 


Parkin signature IX


PR01475I 10.01 8.01e-09 86-108


1254 


IPB002209 


HBGF (heparin binding growth
factor)/FGF (fibroblast growth
factor) family


IPB002209B 26.84 8.50e-31 90-128
IPB002209C 23.35 1.00e-19 137-164


1254 


PR00262 


IL1/HBGF family signature I 


PR00262A 25.25 4.38e-11 77-104


1254 


PR00263 


Heparin binding growth factor family 
signature IV 


PR00263D 13.56 5.57e-11 106-125
PR00263C 8.53 7.51e-10 90-102 
PR00262B 23.59 1.00e-08 108-128 


1258 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 7.48e-10 165-202 


1260 


IPB000956 


Stathmin 


IPB000956B 9.49 7.36e-11 208-241


1260 


PR00345 


Stathmin family signature II 


PR00345B 6.89 9.15e-11 207-235


1260 


IPB000533 


Tropomyosin 


IPB000533C 10.81 3.06e-09 113-154 


1261 


IPB000215 


Serpins 


IPB000215D 15.35 5.03e-14 324-350 
IPB000215A 13.01 2.91e-12 49-72 
IPB000215C 13.90 5.00e-09 216-230 


1262 


PR01377 


Claudin-1 signature I


PR01377A 7.94 1.00e-16 22-33 


1263 


PR00328 


GTP-binding SAR1 protein signature
I 


PR00328A 12.43 5.14e-12 27-50 
PR00328B 7.64 2.38e-11 55-79


1263 


IPB000251 


ADP-ribosylation factors family 


IPB000251A 23.98 9.70e-09 55-108 


1264 


IPB001919 


"Cellulose-binding domain, bacterial 
type" 


IPB001919B 14.22 2.97e-09 270-294 


1265 


PR00258 


Speract receptor signature II 


PR00258B 7.94 3.00e-16 493-504 
PR00258C 9.05 3.70e-14 62-72 
PR00258C 9.05 7.30e-14 508-518 
PR00258A 13.56 4.34e-13 474-490 
PR00258D 14.29 2.66e-12 93-107 
PR00258D 14.29 4.55e-12 539-553
PR00258A 13.56 7.20e-11 133-149
PR00258D 14.29 4.53e-10 294-308 
PR00258A 13.56 6.22e-10 229-245 
PR00258C 9.05 4.83e-09 163-173 
PR00258E 14.06 5.72e-09 215-227 
PR00258E 14.06 7.20e-09 562-574 


1266 


PR00258 


Speract receptor signature II 


PR00258B 7.94 3.00e-16 493-504 
PR00258C 9.05 3.70e-14 62-72 
PR00258C 9.05 7.30e-14 508-518 
PR00258A 13.56 4.34e-13 474-490 
PR00258D 14.29 2.66e-12 93-107 
PR00258D 14.29 4.55e-12 539-553
PR00258A 13.56 7.20e-11 133-149
PR00258D 14.29 4.53e-10 294-308 
PR00258A 13.56 6.22e-10 229-245 
PR00258C 9.05 4.83e-09 163-173 
PR00258E 14.06 5.72e-09 215-227 
PR00258E 14.06 7.20e-09 562-574 


1270 


PR01305 


Invasion protein B family signature 
IV 


PR01305D 7.82 6.19e-09 423-436 


1273 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245A 22.45 1.00e-27 207-247


1273


IPB003527




MAP kinase 


IPB003527C 14.70 2.94e-27 199-247 


1273 


IPB000961 


Protein kinase C-terminal domain 


IPB000961C 15.48 5.95e-22 214-248 
IPB003527D 21.53 2.80e-17 256-297 


1273 


IPB001772


Kinase associated domain 1 


IPB001772C 20.66 3.29e-17 202-232 


1273 


IPB000095 


PAK-box / P21-Rho-binding


IPB000095E 17.62 6.35e-17 215-260 


1273 


IPB000861 


PKN/rhophilin/rhotekin rho-binding 
repeat 


IPB000861F 16.50 9.81e-16 208-262 


1273 


IPB000959 


POLO box duplicated region 


IPB000959B 15.68 3.01e-14 191-231 


1273 


IPB000494 


"Epidermal growth-factor receptor 
(EGFR), L domain" 


IPB000494C 24.40 7.88e-14 201-247 
IPB001245B 21.68 6.19e-13 263-301
IPB003527G 17.26 3.20e-10 360-397
IPB000961D 21.23 5.27e-10 259-300
IPB000961A 16.82 3.33e-09 102-136 


1273 


PR00109 


Tyrosine kinase catalytic domain 
signature II 


PR00109B 11.07 7.75e-09 214-232


1275 


IPB001762 


Disintegrin 


IPB001762A 23.93 4.33e-23 458-498 


1275 


IPB002870 


Reprolysin family propeptide 


IPB002870B 24.73 3.54e-20 131-169 


1275 


PR00289 


Disintegrin signature I 


PR00289A 14.29 1.16e-14 474-493
IPB002870F 18.81 3.03e-14 402-426 
IPB002870E 11.90 2.46e-12 361-373 
IPB001762B 10.06 3.40e-12 505-515 
IPB001762A 23.93 9.20e-11 426-466


1275 


IPB000130 


"Neutral zinc metallopeptidases,
zinc-binding region" 


IPB000130 5.86 1.56e-10 359-369 


1275 


PR00138 


Matrixin signature IV 


PR00138D 14.57 2.54e-10 359-384
IPB002870D 16.31 4.77e-10 327-342 


1275 


IPB001774 


Delta serrate ligand 


IPB001774C 18.25 5.31e-10 677-719 


1275 


PR00480 


Astacin family signature II 


PR00480B 14.35 5.57e-10 354-372 


1275 


PR00436 


Interleukin-8 signature I


PR00436A 15.20 7.43e-10 5-28 


1275 


IPB001818 


Matrixin 


IPB001818D 14.91 1.72e-09 353-384
PR00289B 11.74 3.80e-09 503-515 
IPB002870A 12.22 6.54e-09 85-101 


1275 


IPB003306 


WIF domain 


IPB003306E 25.51 7.40e-09 654-699 


1275 


PR01236 


Tumour necrosis factor beta 
(lymphotoxin-alpha) signature I 


PR01236A 4.92 7.49e-09 17-33 
IPB002870C 11.01 9.64e-09 295-305 


1277 


PR01415 


Ankyrin repeat signature I 


PR01415A 12.73 1.00e-12 341-353 
PR01415A 12.73 2.29e-11 302-314


1277 


PR01256 


Otx1 transcription factor signature II


PR01256B 5.92 4.44e-09 431-443 
PR01256B 5.92 9.39e-09 432-444 


1278 


PR00756 


Membrane alanyl dipeptidase (M1)
family signature IV 


PR00756D 10.78 7.75e-18 412-427
PR00756A 12.71 1.45e-17 245-260
PR00756B 15.53 2.04e-14 297-312 
PR00756E 10.37 5.68e-09 431-443 


1278 


IPB000130 


"Neutral zinc metallopeptidases, 
zinc-binding region" 


IPB000130 5.86 6.57e-09 412-422 


1278 


IPB002594 


Glycoside hydrolase family 12 


IPB002594A 4.24 1.00e-08 26-35 


1288 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 6.85e-13 252-266 


1288 


PR00019 


Leucine-rich repeat signature I 


PR00019A 11.72 5.64e-09 164-177


1288 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 6.19e-09 348-385 
PR00019B 11.42 8.91e-09 112-125 


1290 


PR00019 


Leucine-rich repeat signature II 


PR00019B 11.42 4.18e-12 83-96 
PR00019A 11.72 1.00e-10 86-99 
PR00019A 11.72 1.67e-10 111-124 


1290 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 7.43e-10 131-145 


1290 


IPB000267 


Asparaginase/glutaminase family 


IPB000267A 12.78 7.67e-09 11-27 


1290 


PR01528 


EDG-4 lysophosphatidic acid 
receptor signature II 


PR01528B 3.89 8.48e-09 130-144 


1292 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 5.85e-09 195-232 


1293 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 5.85e-09 195-232 


1295 


IPB001073 


Complement C1q protein


IPB001073B 20.88 6.35e-20 92-126 


1295 


PR00007 


Complement C1Q domain signature
III 


PR00007C 16.13 5.93e-14 159-180 
PR00007B 15.63 1.66e-13 112-131 
IPB001073C 13.07 2.25e-13 159-178 



IPB001073D 7.60 6.40e-12 191-200
IPB001073A 22.14 4.67e-11 32-66
PR00007D 9.66 6.29e-10 189-199 
PR00007A 20.64 3.68e-09 86-112 


1295 


PR00513 


5-hydroxytryptamine 1B receptor
signature IV 


PR00513D 10.60 9.80e-09 50-67 


1296 


PR01481 


Neurotensin type 2 receptor signature 
VI 


PR01481F 11.66 8.46e-28 236-259 
PR01481E 6.05 7.87e-25 214-235 
PR01481C 15.05 1.00e-17 150-163 


1296 


PR01479 


Neurotensin receptor signature II 


PR01479B 12.40 2.43e-17 89-101 
PR01481A 7.58 3.54e-16 1-13
PR01481B 6.68 1.45e-15 14-26
PR01481D 4.62 2.19e-15 164-175
PR01479B 8.74 3.70e-15 305-315
PR01479D 13.10 6.57e-14 294-304
PR01479A 8.89 1.00e-13 29-39 


1296 


PR00237 


Rhodopsin-like GPCR superfamily 
signature VI 


PR00237F 14.34 9.33e-13 269-293 
PR00237G 19.23 4.44e-12 314-340 


1296 


PR00665 


Oxytocin receptor signature IV 


PR00665D 10.30 1.32e-11 108-124
PR01479F 8.03 5.19e-11 342-352
PR00237A 9.81 7.33e-10 34-58 
PR00237D 9.76 7.43e-10 125-146 


1297 


IPB001101


Plectin repeat 


IPB001101C 6.05 3.42e-35 894-946


1297 


IPB001589 


Actinin-type actin-binding domain 


IPB001589C 16.73 1.78e-31 285-316 
IPB001589D 26.07 2.55e-27 340-383 
IPB001101M 9.29 7.80e-27 1607-1657
IPB001101Z 7.76 2.12e-25 3013-3066
IPB001101B 12.20 1.00e-24 791-844
IPB001101F 10.86 3.20e-22 1078-1126
IPB001101E 6.00 7


1297 


IPB002017 


Spectrin repeat 


IPB002017A 14.19 3.25e-11 246-262
IPB001101Q 7.28 8.69e-11 2855-2892
IPB001101S 8.38 9.52e-11 2695-2738
IPB001101N 4.86 2.32e-10 1779-1833
IPB001101N 4.86 3.81e-10 1758-1812
IPB001101N 4.86 3.87e-10 1737-1791
IPB001101R 5.90 3.91e-10 3112-3165
IPB001101T 7.36 5.01e-10 2720-2774
IPB001101W 10.36 5.46e-10 3033-3062
IPB001101T 7.36 5.53e-10 3067-3121
IPB001101R 5.90 2.07e-09 2727-2780


1297 


IPB000237 


GRIP domain 


IPB000237B 30.66 2.76e-09 2392-2442 
IPB001101Q 7.28 3.27e-09 3166-3203


1297


IPB001664 


Intermediate filament proteins 


IPB001664B 17.44 5.92e-09 1742-1781 
IPB001101O 8.21 6.25e-09 1767-1800


1297 


IPB002079 


"Gag polyprotein, inner coat protein 
p12"


IPB002079J 10.53 6.85e-09 1766-1794


1297 


IPB001715


Calponin homology (CH) domain


IPB001715A 10.74 7.00e-09 241-251 
IPB001101W 10.36 7.63e-09 2798-2827
IPB001589E 11.55 8.94e-09 389-398 


1297 


IPB003865 


Prolyl 4-hydroxylase alpha subunit 
C-terminus 


IPB003865A 20.35 9.33e-09 2093-2137 
IPB001101X 9.00 9.86e-09 3063-3096


1298 


IPB001101


Plectin repeat 


IPB001101C 6.05 3.42e-35 906-958


1298 


IPB001589 


Actinin-type actin-binding domain 


IPB001589C 16.73 1.78e-31 297-328
IPB001589D 26.07 2.55e-27 352-395
IPB001101M 9.29 7.80e-27 1619-1669
IPB001101Z 7.76 2.12e-25 3025-3078 












IPB001101B 12.20 1.00e-24 803-856
IPB001101F 10.86 3.20e-22 1090-1138
IPB001101E 6.00 7


1298 


IPB002017 


Spectrin repeat 


IPB002017A 14.19 3.25e-11 246-262
IPB001101Q 7.28 8.69e-11 2867-2904
IPB001101S 8.38 9.52e-11 2707-2750
IPB001101N 4.86 2.32e-10 1791-1845
IPB001101N 4.86 3.81e-10 1770-1824
IPB001101N 4.86 3.87e-10 1749-1803
IPB001101R 5.90 3.91e-10 3124-3177
IPB001101T 7.36 5.01e-10 2732-2786
IPB001101W 10.36 5.46e-10 3045-3074
IPB001101T 7.36 5.53e-10 3079-3133
IPB001101R 5.90 2.07e-09 2739-2792


1298 


IPB000237 


GRIP domain 


IPB000237B 30.66 2.76e-09 2404-2454 
IPB001101Q 7.28 3.27e-09 3178-3215


1298 


IPB001664 


Intermediate filament proteins 


IPB001664B 17.44 5.92e-09 1754-1793 
IPB001101O 8.21 6.25e-09 1779-1812




1298


IPB002079


"Gag polyprotein, inner coat protein 
p12"


IPB002079J 10.53 6.85e-09 1778-1806 


1298


IPB001715


Calponin homology (CH) domain 


IPB001715A 10.74 7.00e-09 241-251 
IPB001101W 10.36 7.63e-09 2810-2839
IPB001589E 11.55 8.94e-09 401-410 


1298 


IPB003865 


Prolyl 4-hydroxylase alpha subunit 
C-terminus 


IPB003865A 20.35 9.33e-09 2105-2149 
IPB001101X 9.00 9.86e-09 3075-3108


1306 


IPB000998


MAM domain 


IPB000998C 18.63 9.65e-15 510-525
IPB000998D 18.66 2.41e-14 575-598 
IPB000998B 17.20 4.55e-10 430-442 


1306 


PR00020 


MAM domain signature I 


PR00020A 20.48 7.62e-10 428-446 
PR00020C 12.01 4.78e-09 509-520 



1308 


IPB001552


Acyl-CoA dehydrogenase 


IPB001552E 22.77 2.46e-19 726-766 
IPB001552D 24.88 5.35e-19 635-677 
IPB001552C 25.04 7.75e-15 581-621 
IPB001552B 18.05 3.19e-12 530-552
IPB001552A 11.25 6.90e-10 503-514






Acyl-CoA dehydrogenase 


IPB001552E 22.77 2.46e-19 708-748
IPB001552D 24.88 5.35e-19 617-659
IPB001552C 25.04 7.75e-15 563-603
IPB001552B 18.05 3.19e-12 512-534
IPB001552A 11.25 6.90e-10 485-496


1310 


IPB002524 


Cation efflux family 


IPB002524B 23.89 5.20e-17 86-125 


1310 


IPB003452 


Stem cell factor 


IPB003452B 19.11 6.63e-09 145-193 


1311 


PR00215 


Neuromodulin signature III 


PR00215C 13.82 7.58e-10 743-763 


1311 


PR00194 


Tropomyosin signature IV 


PR00194D 9.54 7.19e-09 622-645 


1311 


IPB001422 


Neuromodulin (GAP-43) 


IPB001422A 13.23 7.43e-09 718-762 


1314 


IPB000569


HECT domain (Ubiquitin-protein 
ligase) 


IPB000569C 20.19 8.94e-30 2270-2299 


1314 


IPB000135


High mobility group proteins HMG1
and HMG2 


IPB000135D 2.13 9.00e-17 361-385 
IPB000135D 2.13 7.04e-16 370-394 
IPB000135D 2.13 3.70e-15 360-384
IPB000135D 2.13 5.50e-15 364-388 
IPB000135D 2.13 7.43e-15 367-391 
IPB000135D 2.13 7.94e-15 365-389 
IPB000569A 16.82 8.58e-15 2 


1314 


IPB001580 


Calreticulin family 


IPB001580F 2.93 5.50e-10 370-379 


1314 


IPB001990


Granins (chromogranin or 
secretogranin) 


IPB001990C 33.59 6.26e-10 352-399 
IPB001580F 2.93 7.75e-10 369-378 












IPB000135D 2.13 8.34e-10 351-375 
IPB000569B 18.58 8.92e-10 2233-2249


1314 


IPB003403 


Herpesvirus immediate early protein 


IPB003403E 17.25 8.97e-10 359-386 


1314 


IPB002889 


WSC domain 


IPB002889B 11.76 2.88e-09 1392-1438 
IPB000135D 2.13 4.09e-09 381-405 
IPB000135D 2.13 4.18e-09 352-376 
IPB000135D 2.13 4.36e-09 353-377 
IPB002889B 11.76 4.66e-09 1440-1486 


1314 



IPB002000 


Lysosome-associated membrane 
glycoprotein (Lamp) 


IPB002000D 5.87 6.26e-09 1429-1442 
IPB000135D 2.13 6.27e-09 349-373 
IPB001580F 2.93 6.40e-09 374-383 
IPB000135D 2.13 6.45e-09 382-406 
IPB002889B 11.76 6.81e-09 1458-1504 
IPB002000D 5.87 7.11e-09 1434-1447 
IPB002889B 11.76 7.47e-09 1417-1463 
IPB001990C 33.59 7.51e-09 347-394 
IPB000135D 2.13 8.36e-09 350-374 
IPB002889B 11.76 9.53e-09 1402-1448


1314 


IPB000637 


HMG-I and HMG-Y DNA-binding 
domain (A+T-hook) 


IPB000637B 14.21 9.73e-09 369-387 


1314 


PR01073 


Presenilin 1 signature III


PR01073C 1.45 9.89e-09 367-378 


1317 


PR01145 


Thyrotropin receptor precursor 
signature I 


PR01145A 6.74 9.10e-l 13-22 


1317 


PR01472


Intercellular adhesion 
molecule/vascular cell adhesion
molecule-1 signature I


PR01472A 16.78 7.66e-09 35-51 


1321 


PR00019 


Leucine-rich repeat signature II 


PR00019B 11.42 7.88e-12 335-348
PR00019B 11.42 1.33e-10 477-490
PR00019A 11.72 4.00e-10 480-493
PR00019A 11.72 4.33e-10 338-351


1321 


IPB001580 


Calreticulin family 


IPB001580F 2.93 4.94e-10 648-657 
IPB001580F 2.93 4.94e-10 649-658 
IPB001580F 2.93 4.94e-10 650-659 
PR00019B 11.42 5.33e-10 167-180 
PR00019A 11.72 4.00e-09 454-467


1321 


IPB000135


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 4.64e-09 637-661
PR00019B 11.42 7.55e-09 193-206 
PR00019B 11.42 7.55e-09 309-322 
PR00019B 11.42 7.82e-09 451-464
IPB000135D 2.13 8.55e-09 635-659 


1322 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 9.14e-12 297-334 


1322 


IPB001000 


Glycoside hydrolase family 10 


IPB001000H 10.38 7.80e-09 8-21 


1323 


IPB001000 


Glycoside hydrolase family 10 


IPB001000H 10.38 7.80e-09 8-21 


1324 


IPB003884 


Factor I membrane attack complex 


IPB003884A 12.20 7.06e-09 34-45 


1328 


PR00258 


Speract receptor signature II 


PR00258B 7.94 5.00e-16 654-665 
PR00258B 7.94 6.50e-16 30-41 
PR00258B 7.94 6.50e-16 204-215 
PR00258A 13.56 9.70e-14 635-651 
PR00258B 7.94 2.58e-13 316-327 
PR00258E 14.06 4.16e-13 491-503 
PR00258A 13.56 5.63e-13 402-418
PR00258A 13.56 6.14e-13 185-201 
PR00258B 7.94 6.62e-13 421-432 
PR00258C 9.05 9.18e-13 45-55 
PR00258A 13.56 1.22e-12 11-27
PR00258A 13.56 1.22e-12 297-313
PR00258E 14.06 1.98e-12 99-111
PR00258E 14.06 9.22e-12 273-285
PR00258D 14.29 2.00e-11 468-482
PR00258D 14.29 3.20e-11 700-714
PR00258D 14.29 2.76e-10 250-264 
PR00258C 9.05 4.95e-10 219-229 
PR00258C 9.05 4.95e-10 331-341 
PR00258E 14.06 5.42e-10 385-397 
PR00258D 14.29 8.06e-10 362-376 
PR00258C 9.05 7.51e-09 436-446 


1333 


IPB000970 


"Developmental signaling protein, 
Wnt-1 family" 


IPB000970E 22.74 1.00e-40 202-255 
IPB000970F 23.43 1.51e-40 307-355
IPB000970C 13.22 2.80e-25 101-132 
IPB000970B 14.73 6.14e-23 65-88 


1333 


PR01349 


Wnt protein signature IV 


PR01349D 8.90 3.81e-20 222-237
IPB000970D 13.85 3.48e-17 167-186
PR01349C 10.34 3.86e-15 167-179 
PR01349A 11.18 8.55e-14 103-117 
PR01349B 10.00 3.32e-12 122-135 
PR01349E 12.39 5.61e-11 283-294


1333 


IPB001073 


Complement C1q protein


IPB001073A 22.14 4.20e-10 137-171 
IPB000970A 13.08 5.78e-10 41-56 


1335 


PR00245 


Olfactory receptor signature I 


PR00245A 10.98 8.92e-11 59-70


1335 


PR00534 


Melanocortin receptor family 
signature I 


PR00534A 12.77 3.61e-09 18-30 


1337 


IPB001522 


"Fatty acid desaturase, type 1" 


IPB001522D 12.81 1.00e-40 119-154 
IPB001522F 22.32 1.00e-40 241-295
IPB001522E 20.55 5.85e-36 163-216 
IPB001522C 14.10 2.89e-33 81-117 


1337 


PR00075 


Fatty acid desaturase family 1 
signature IV 


PR00075D 13.27 3.57e-33 131-160 
PR00075C 10.51 3.40e-22 94-114 
PR00075G 10.50 6.62e-20 268-282 
PR00075E 11.60 6.46e-18 192-210 
PR00075A 16.73 9.44e-17 47-67 
PR00075F 14.62 8.81e-16 225-246
PR00075B 13.44 4.56e-14 71-93 
IPB001522B 29.55 6.82e-12 29-80 


1339 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 2.57e-17 46-70 
IPB000135D 2.13 9.86e-17 43-67 
IPB000135D 2.13 6.10e-16 45-69 
IPB000135D 2.13 1.77e-15 47-71 
IPB000135D 2.13 2.93e-15 44-68 
IPB000135D 2.13 3.83e-15 41-65
IPB000135D 2.13 2.95e-14 48-72 
IPB000135D 2.13 7.93e-14 42-66 
IPB000135D 2.13 7.81e-13 49-73 


1339 


IPB001422 


Neuromodulin (GAP-43) 


IPB001422C 16.82 3.41e-11 40-75
IPB000135D 2.13 9.08e-11 40-64
IPB000135D 2.13 9.69e-11 50-74


1339 


IPB001580 


Calreticulin family 


IPB001580F 2.93 1.00e-10 50-59 
IPB000135D 2.13 2.17e-10 51-75 
IPB000135D 2.13 3.15e-10 39-63 
IPB001580F 2.93 4.94e-10 57-66 
IPB001580F 2.93 4.94e-10 58-67 
IPB001580F 2.93 5.50e-10 56-65 
IPB001580F 2.93 6.06e-10 54-63 
IPB001580F 2.93 7.75e-10 49-58 
IPB001422C 16.82 7.99e-10 43-78 
IPB001422C 16.82 8.58e-10 42-77
IPB000135D 2.13 8.63e-10 38-62
IPB001580F 2.93 8.88e-10 51-60
IPB001422C 16.82 9.05e-10 46-81
IPB001580F 2.93 9.44e-10 59-68
IPB001422C 16.82 5.61e-09 48-83 
IPB000135D 2.13 6.27e-09 37-61 
lriiUU14Z/o 10. oZ o.*tl/e-U!7 *f*H/y 

IPB001580F 2.93 6.40e-09 52-61
TPpfiniAoor* i£ ro r oo<» no A7-R9 


1339 


IPB000637 


HMG-I and HMG-Y DNA-binding 
domain (A+T-hook)


IPB000637B 14.21 1.00e-08 45-63

IPB001580F 2.93 1.00e-08 61-70


1340 


IPB004000 


Actin and actin-like 


IPB004000C 8.66 4.86e-20 137-191
lJrJb>UU*tUUUlJ Ij.jo d. /ue-io ZO /-JZl 


1340 


PR00190 


Actin signature VI 


PR00190F 7.36 2.20e-14 135-154 

TTJtJAA/lflAA A O Q7 A £Aa 11 ^ A1 
lrDUU4UUUA y.y 1 H.04e-lj D-^O 

IPB004000B 6.57 5.80e-12 83-133 


1341 


PR01333 


Two pore domain K+ channel 
signature I 


T>1? A1 W A 1 R OA A AAa 1 R 1 7^ 1 


1341 


PR01463 


EAG/ELK/ERG potassium channel 
family signature VI 


PR01463F 4.09 1.95e-12 243-260
rKuijJJij lu.jy y. / ie- iu zjd-zo^- 


1341 


PR01526 


EDG-6 sphingosine 1-phosphate 
receptor signature IV 


PR01526D 5.56 9.71e-09 1-16 


1343 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.20e-10 348-385 


1344 


IPB000998 


MAM domain 


IPB000998C 18.63 1.95e-12 833-848
rpnnnnooRR 17 7A i 67p-.11 761-771 


1344 


PR00020 


MAM domain signature I 


PR00020A 20.48 3.62e-11 759-777
ppaaaoap 17 ni r 17p_ia R17-R41 

rKUUUZUU 1Z.UI o. IZe-lU OjZ-OH-3 

IPB000998D 18.66 9.61e-10 898-921 


1344 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


tpraaiaaaa 17 ^i 7 1 1<» no i ^4-1 76 


1344 


PR00096 


Glutamine amidotransferase 
superfamily signature III 


ppaaaq^p i ^ rs o ?Rp-oq S14-547 


1345 


IPB002350 


Kazal-type serine protease inhibitor 
family 


IPB002350 31.78 3.92e-13 127-167 


1345 


IPB000867 


Insulin-like growth factor-binding 
protein 


IPB000867B 11.44 1.37e-12 75-91 


1345 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 3.88e-10 231-268


1345 


IPB002328


Zinc-containing alcohol 
dehydrogenase 




1346


IPB000224 


Vesiculovirus phosphoprotein 


TPRnnft99AA 7 0/;/; 74p-1fl 417-470 


1346


IPB000135 


High mobility group proteins HMG1 
and HMG2


IPRAAA115n9 117 16^-10 410-4^4 


1346 


PR00449 


Transforming protein P21 ras 
signature I 


PR00449A 12.48 8.16e-10 83-104 


1346




GTP1/OBG GTP-binding protein

family signature I 


PR00326A 8.70 9.13e-10 85-105
IPB000135D 2.13 3.09e-09 434-458 


1346 


IPB000619 


Guanylate kinase 


IPB000619A 18.08 4.21e-09 85-102


1346 


PR00905 


Hypothetical mycoplasma lipoprotein 
(MG045) signature VIII 


PR00905H 6.88 5.89e-09 343-363 


1346 


PR00364 


Disease resistance protein signature I 


PR00364A 8.29 7.14e-09 84-99


1346 


PR00094 


Adenylate kinase signature I 


PR00094A 9.62 9.57e-09 86-99 


1346 


PR00918 


Calicivirus non-structural polyprotein 
family signature I 


PR00918A 13.81 9.69e-09 79-99 


1346 


IPB000795 


GTP-binding elongation factor 


IPB000795A 10.67 9.77e-09 84-99 
IPB000135D 2.13 9.82e-09 429-453


1348 


PR00406 


Cytochrome B5 reductase signature 
VI 


PR00406F 4.29 4.86e-11 158-166


1348 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 6.48e-11 473-510
IPB003006B 20.23 4.60e-10 291-328 
IPB003006B 20.23 3.08e-09 190-227 


1348 


PR00014 


Fibronectin type III repeat signature
IV 


PR00014D 15.12 8.83e-09 891-905 


1349 


PR00698 


C-elegans Srg family integral 
membrane protein signature V 


PR00698E 14.65 2.76e-09 97-122 


1349 


IPB002146 


ATP synthase B/B' CF(0) 


IPB002146 21.39 6.94e-09 174-212 


1350 


IPB000215


Serpins 


IPB000215D 15.35 1.41e-18 311-337 
IPB000215A 13.01 8.29e-18 72-95 
IPB000215C 13.90 1.53e-15 207-221

TAT5AAA'"> 1 CD 1 C *> £ *7 r\ r\ _ 1 1 *">*70 vtrvi 

IPB000215E 15.36 7.00e-13 378-402 
IPB000215B 9.87 4.68e-11 180-192


1352 


IPB001737


Ribosomal RNA adenine
dimethylase 


IPB001737A 27.11 8.54e-10 134-179




IPB000906


ZU5 domain 


IPBUUU906Cj 25.85 6.28e-10 164-212 
IPB000906A 22.49 3.16e-09 58-100 


1356 


IPB001245


Tyrosine kinase catalytic domain 


TTir>/"\ a 1 o A era o 1 zr c a ~ 10 00c /i ti 

IPB001245B 21.68 6.54e-13 385-423 


1356 


IPB000095 


PAK-box /P21-Rho-binding 


IPB000095F 16.47 3.97e-11 389-443


1356 


IPB000961 


Protein kinase C-terminal domain 


IPB000961D 21.23 2.22e-10 381-422 
IPB001245A 22.45 3.18e-10 332-372


1356 


IPB001359 


Synapsin 


IPB001359H 22.58 7.12e-10 696-746
IPB001359H 22.58 4.84e-09 695-745 


1356 


IPB002889 


WSC domain 


IPB002889B 11.76 6.81e-09 1510-1556
IPB002889B 11.76 9.25e-09 1491-1537 


1357 


IPB001359 


Synapsin 


IPB001359H 22.58 7.12e-10 289-339
IPB001359H 22.58 4.84e-09 288-338 


1357 


IPB002889 


WSC domain 


IPB002889B 11.76 6.81e-09 1103-1149 
IPB002889B 11.76 9.25e-09 1084-1130


1358 


PR00237 


Rhodopsin-like GPCR superfamily
signature VII 


PR00237G 19.23 9.64e-15 41-67 


1358 


IPB000276 


Rhodopsin-like GPCR superfamily


IPB000276D 9.40 5.05e-12 51-67
IPB000276C 8.03 8.50e-11 8-19


1359 


PR01041


Methionyl-tRNA synthetase
signature V 


PR01041E 16.72 2.69e-17 306-321
PR01041D 11.02 7.43e-13 276-287 

r>T)A1A/l1 A 11 A C\ O /TO„ 11 /IT /CA 

PRO 104 1 A 11.40 o.ooe- 13 47-60 


1359 


IPB001412 


Aminoacyl-transfer RNA synthetases 
class-I 


IPB001412B 6.33 8.71e-12 344-354
PR01041B 11.59 4.06e-09 82-96


1359 




Arginyl-tRNA synthetase signature II 


PR01038B 9.12 7.68e-09 59-75 


1360 


IPB000353 


"Class II histocompatibility antigen, 
beta chain, beta-1 domain" 


IPB000353A 18.51 7.30e-27 42-91 


1363 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.83e-11 374-411


1365 


PR01360 


Interleukin-1 receptor antagonist 
precursor IL-1RA signature VI 


PR01360F 14.44 9.86e-18 116-134 


1365 


PR00264 


Interleukin-1 precursor family 
signature III 


PR00264C 19.37 4.90e-16 108-123 
PR01360E 9.69 9.33e-13 95-115 


1365 


IPB000975


Interleukin-1 


IPB000975E 28.12 3.57e-12 95-134 


1365 


PR01357 


Interleukin-1 alpha/beta precursor
family signature VI 


PR01357F 17.87 7.15e-10 108-123 
PR00264A 18.63 9.85e-09 55-75 


1366 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 4.15e-28 1224-1251
IPB001599F 18.95 7.00e-24 786-815 
IPB001599H 18.42 6.40e-20 999-1026 
IPB001599N 24.85 7.69e-20 1417-1449 
IPB001599A 10.97 9.69e-18 123-141


1366 


IPB001134 


"Netrin, C-terminus"


IPB001134C 17.82 4.13e-13 1237-1251

IPB001599M 13.29 4.71e-13 1364-1375 

IPB001599G 13.87 8.94e-13 967-976 

IPB001599B 7.45 4.89e-12 209-221 

IPB001599D 11.61 6.90e-12 729-739 
roD/im coot in oo i nn a 11 i a^c iaqa 

IPB001599I 10.83 7.60e-11 1034-1043
IPB001599K 8.15 1.46e-10 1194-1205

IPB001599C 14.40 3.55e-09 236-252 

TDDAA1 ^QQT? 1 1 A£ Q llt> AQ I'sfsJlfi^ 
JLroUU ijyyEi i l.UO y. / /Q-\)y /JO-/OJ 


1368 


IPB001526 


Ly-6/u-PAR domain 


IPB001526C 13.04 7.55e-15 90-105 

TDHAA1*9AA 1 1 94 Q 1 At*-\ 1 19-97 

IPB001526B 12.26 7.75e-10 46-55 


1967 


IPB001400 


Somatotropin hormone family 


IPB001400B 23.62 1.90e-28 99-135
IPB001400A 14.85 4.91e-16 55-78


1967 


PR00836 


Somatotropin hormone family 
signature II 


PR00836B 17.50 2.44e-14 121-139
PR00836A 15.53 2.35e-13 99-112


1968 


IPB001400 


Somatotropin hormone family 


IPB001400B 23.62 1.90e-28 99-135
IPB001400A 14.85 4.91e-16 55-78


1968


PR00836 


Somatotropin hormone family 
signature II 


PR00836B 17.50 2.44e-14 121-139
PR00836A 15.53 2.35e-13 99-112


1969 


IPB001400 


Somatotropin hormone family 


IPB001400B 23.62 1.90e-28 99-135 
IPB001400A 14.85 4.91e-16 55-78


1969 


PR00836 


Somatotropin hormone family 
signature II 


PR00836B 17.50 2.44e-14 121-139
PR00836A 15.53 2.35e-13 99-112


1970 


IPB001400 


Somatotropin hormone family 


IPB001400B 23.62 1.90e-28 99-135
IPB001400A 14.85 4.91e-16 55-78


1970 


PR00836 


Somatotropin hormone family 
signature II 


PR00836B 17.50 2.44e-14 121-139 
PR00836A 15.53 2.35e-13 99-112 


1971 


IPB000215 


Serpins 


IPB000215E 15.36 5.76e-17 425-449 

Tt>DAAA91<A 11 A1 14% 1 •*» 111-1^4 

IrDUUUZljA Ij.Ui j.*fZe-lJ lil-lOt 

IPB000215D 15.35 8.05e-11 346-372
IPB000215C 13.90 1.29e-10 241-255

TPRAAA91 5tt Q R7 f% A4p-1ft 914-776 


1972 


PR00390 


Phospholipase C signature I 


PR00390A 14.24 6.34e-20 2-20 


1973 


IPB000734 


Lipase 


IPB000734 10.25 8.50e-09 468-482 


1977 


IPB000689 


UbiH/COQ6 monooxygenase family 


inDAAA/conrv OO A*7 9 Ola 1Q 177-A97 

IPB000689B 27.03 9.59e-28 217-251 

TDDAAA/CQQ/^ 1 Q 9£ 1 9/l*» 9A. 9A9-9Rfi 
lrDUUUOoyC 18. /0 J. /4C-Z4 ZOZ-ZOO 

IPB000689A 9.11 1.25e-11 52-64


1977 


PR00420 


Aromatic-ring hydroxylase 
(flavoprotein monooxygenase) 
signature III 


nnnn/iinr 1 in aa Q n Q 1 1 191 ICS 
FKUu4zUO lz.44 o.Die-ll J/j-ooo 


1977 


PR01001 


FAD-dependent glycerol-3- 
phosphate dehydrogenase family 
signature I 


PR01001A 8.45 1.60e-09 51-63 

T>"T> A A yl *^ A A 1 C A*7 1 AC a AO <9 9/1 

PR00420A 15.97 i.yje-UV 3Z-/4 
PR00420B 13.97 8.53e-09 215-230 




IPB000345


Cytochrome c family heme-binding
site


IPB000345 9.03 7.19e-09 153-165


1982 


IPB002610 


Rhomboid family 


IPB002610C 5.81 3.81e-10 262-272 
IPB002610B 5.33 6.81e-09 203-213 


1984 


IPB001124 


Lipid-binding serum glycoprotein 


IPB001124D 21.85 2.50e-12 251-287 
IPB001124C 25.71 5.08e-11 184-227


1985 


IPB000817 


Prion protein 


IPB000817A 8.34 6.40e-09 70-112 
IPB000817A 8.34 8.67e-09 64-106 


1988 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442F 15.05 1.00e-40 585-628 
IPB001442C 14.98 4.82e-40 498-532 
IPB001442A 26.12 4.09e-39 259-311
IPB001442D 15.34 1.00e-34 533-564 


1988 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 1.93e-27 300-353 
IPB001442A 26.12 8.93e-27 103-155 
IPB001442A 26.12 9.69e-27 106-158 
IPB001442A 26.12 4.19e-26 368-420 

IPB000885A 11.46 4.80e-26 363-400
IPB001442A 26.12 6.52e-26 112-164 
IPB001442A 26.12 9.71 


1988 


IPB001073 


Complement C1q protein


IPB001073A 22.14 9.18e-19 374-408 
IPB000885B 19.15 9.40e-19 309-362
IPB000885B 19.15 9.40e-19 373-426 
IPB001442A 26.12 9.42e-19 265-317 

IPB001442A 26.12 9.77e-19 133-185 
IPB000885B 19.15 1.12e-18 81-134 
IPB001442A 26.12 1.33e 


1988 


IPB001285 


Synaptophysin/synaptoporin 


IPB001285F 6.39 4.08e-09 340-384 
IPB000885B 19.15 4.11e-09 48-101 
IPB000885B 19.15 4.35e-09 174-227 
IPB001442B 12.38 4.41e-09 257-277
IPB001442B 12.38 4.41e-09 417-437 
IPB000885B 19.15 4.68e-09 147-200 
IPB000885B 19.15 4.68e- 


1988 


IPB000817 


Prion protein 


IPB000817A 8.34 7.73e-09 258-300 
IPB001073A 22.14 7.75e-09 76-110 
IPB001442B 12.38 7.81e-09 25-45 
IPB001073A 22.14 7.89e-09 151-185 
IPB001073A 22.14 8.31e-09 416-450 
IPB000817A 8.34 8.39e-09 255-297 
IPB001442B 12.38 8.42e-09 363-383 
IPB001442A 26.12 8.59e-09 160-212 
IPB001442A 26.12 8.90e-09 40-92 
IPB001442B 12.38 8.91e-09 429-449 
IPB000885B 19.15 8.94e-09 324-377 

TDD A A 1 ATI A 1 A A 1Ao AO QO 1 1 A 

lr dUU 1 U 15 A 22, 14 yoUe-Uy 82- 1 1 0 
IPB001073A 22.14 9.30e-09 307-341 
IPB001442B 12.38 9.64e-09 323-343 
IPB001073A 22.14 9.72e-09 148-182
IPB000885B 19.15 9.84e-09 412-465


1989 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat" 


IPB000033D 30.18 1.18e-14 111-149

IPB000033D 30.18 6.25e-11 67-105
IPB000033C 11.58 6.40e-10 135-149 
IPB000033C 11.58 8.07e-09 48-62 
IPB000033C 11.58 8.07e-09 91-105 


1990 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat" 


IPB000033D 30.18 1.18e-14 111-149 

IPB000033D 30.18 6.25e-11 67-105
IPB000033C 11.58 6.40e-10 135-149
IPB000033C 11.58 8.07e-09 48-62
IPB000033C 11.58 8.07e-09 91-105


1992 


PR00205 


Cadherin signature II


PR00205B 20.09 4.94e-14 114-143 
PR00205D 12.22 9.31e-14 198-217
PR00205F 19.57 1.53e-12 167-193
PR00205D 12.22 8.20e-12 93-112
PR00205G 13.05 2.46e-11 201-218
PR00205G 13.05 3.93e-10 96-113


1992 


IPB002126 


Cadherin domain 


IPB002126B 12.04 7.68e-10 102-119 
PR00205A 17.38 8.15e-09 160-179 


1993 


PR00205 


Cadherin signature II 


PR00205B 20.09 4.94e-14 114-143 
PR00205D 12.22 9.31e-14 198-217
PR00205F 19.57 1.53e-12 167-193
PR00205D 12.22 8.20e-12 93-112
PR00205G 13.05 2.46e-11 201-218
PR00205G 13.05 3.93e-10 96-113 


1993 


IPB002126


Cadherin domain 


IPB002126B 12.04 7.68e-10 102-119 
PR00205A 17.38 8.15e-09 160-179 


1994 


IPB002469 


"Dipeptidyl peptidase IV, N- 
terminus" 


IPB002469J 8.97 3.52e-12 17-33 


1995 


PR01534 


Vomeronasal type 1 receptor family 
signature V 


PR01534E 7.16 1.23e-09 5-19 


1996 


IPB000221 


Protamine P1


IPB000221 5.48 2.97e-12 124-150 
IPB000221 5.48 9.30e-12 113-139 
IPB000221 5.48 2.19e-11 153-179
IPB000221 5.48 2.59e-11 114-140
IPB000221 5.48 3.91e-11 128-154


1996 


IPB000492 


Protamine 2 (PRM2) 


IPB000492B 5.26 5.88e-11 148-182
IPB000221 5.48 6.16e-11 142-168
IPB000221 5.48 6.43e-11 149-175
IPB000221 5.48 7.62e-11 110-136
IPB000492B 5.26 9.35e-11 129-163
IPB000492B 5.26 9.35e-11 152-186
IPB000221 5.48 2.73e-10 168-194 
IPB000221 5.48 4.70e-10 112-138 
IPB000221 5.48 4.70e-10 144-170 
IPB000492B 5.26 6.97e-10 153-187 
IPB000492B 5.26 8.12e-10 156-190 
IPB000492B 5.26 8.53e-10 155-189 
IPB000221 5.48 8.89e-10 151-177 
IPB000492B 5.26 9.06e-10 128-162 
IPB000492B 5.26 9.69e-10 150-184 
IPB000221 5.48 1.00e-09 133-159 
IPB000221 5.48 1.46e-09 115-141 
IPB000221 5.48 3.31e-09 159-185 
IPB000221 5.48 3.31e-09 172-198 
IPB000492B 5.26 3.84e-09 125-159 
IPB000221 5.48 5.15e-09 157-183 
IPB000221 5.48 5.27e-09 102-128 


1996 


PR00055 


HIV TAT domain signature III 


PR00055C 9.12 5.92e-09 66-82 
IPB000221 5.48 6.19e-09 166-192 
IPB000492B 5.26 6.38e-09 144-178 
IPB000492B 5.26 6.67e-09 157-191 
IPB000221 5.48 6.88e-09 147-173 
IPB000221 5.48 6.88e-09 161-187 
IPB000492B 5.26 7.75e-09 127-161 
IPB000492B 5.26 8.34e-09 115-149 


1996 


IPB000271 


Ribosomal protein L34 


IPB000271 15.87 9.78e-09 161-198 
IPB000492B 5.26 9.90e-09 161-195 
IPB000221 5.48 1.00e-08 126-152 


1998 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 4.91e-11 52-89
IPB003006B 20.23 3.52e-10 155-192 
IPB003006B 20.23 1.69e-09 250-287
IPB003006B 20.23 4.81e-09 437-474 


1998 


PR01536 


Interleukin-1 receptor type I and type 
II family signature III 


PR01536C 19.92 5.85e-09 59-82 


1999 


IPB000897 


GTP-binding signal recognition 
particle (SRP54) domain 


IPB000897A 9.15 8.60e-11 313-332


2000 


IPB001140 


ABC transporter transmembrane
region


IPB001140A 21.73 2.00e-19 107-153


IPB001140B 15.62 4.44e-10 222-260 


2000 


IPB000795 


GTP-binding elongation factor 


IPB000795A 10 67 7 88p-10 1?D-M^ 


2000 


PR00326 


GTP1/OBG GTP-binding protein 
family signature I 


PR00326A 8 70 4 4Qp-0Q 1 ? 1 141 


2000 


IPB000897 


GTP-binding signal recognition
particle (SRP54) domain 


IPB000897A 9 1 S s S7p-00 i ?n no 


2000 


IPB001324 


Phosphoribulokinase family


IPB001324A 18 19 8 OOp-flO 117 llfi 


2001 


IPB001422 




IPR001499P 1£ £9 9^*» in 77Q fill 
ax j-»ui/i*tZrZrU» io.oz j.zoe-iu //o-oiJ 


2001 


PR01217 


Proline rich exlpn^in <?itxnahirp V7T 




2001 


IPB003134 


Repeat in HS1/Cortactin


IPB003134F 15.66 7.29e-09 776-824 
PR01217D 4.57 7.49e-09 562-583 


2001 


IPB000996 


Clathrin light chain 


IPB000996B 20.25 7.82e-09 752-804 


2001 


IPB002079 


"Gag polyprotein, inner coat protein
p12"


IPB002079J 10.53 9.19e-09 779-807 




IPB000135


High mobility group proteins HMG1
and HMG2


TDDOO0 1 1K A 11 /Zd a £L*s*. An *7^i on 

iroUUUlJDA li.oV y.o2e-09 763-817 


2001 


IPB001084 


Microtubule associated Tau protein 


IPB001084C 7.66 9.64e-09 375-392 


2001 


IPB001101


Plectin repeat


lrDuui iuijv o.j j y.yze-uy yo-ijy 


2002 


IPB001552 


Acyl-CoA dehydrogenase 


IPB001552E 22.77 2.46e-19 523-563 
IPB001552D 24.88 5.35e-19 432-474
IPB001552C 25.04 7.75e-15 378-418 

lrDUUljjZo Io.Uj j.4je-lz 1Z4-140 

IPB001552A 11.25 6.90e-10 97-108 


2003 


IPB000998 




LrDVVKjyyoLj lo. oo i.yoe-ij j4o-joy 


2003 


IPB003886 


Extracellular domain in nidogen 


IPB003886D 13.91 8.77e-15 253-272 


2003 


IPB000152


Aspartic acid and asparagine 
hydroxylation site


IPB000152 8.86 2.89e-14 126-141 


2003


IPB001881


Calcium-binding EGF-like domain


lftJUUloolD lz.zo j. UUe- 14 zUo-z 19 
IrrJUUUijZ 0.0O l.UUe-U Zjj-zoo 
IPB000152 8.86 1.82e-13 208-223 

TPRDDI R5?1 R 19 98 4 7Sp 11 19A 177 

irDUulOOlD 1Z.ZO *t. / JC-1 j 1Z0-1J/ 


2003 


IPB001774 


Delta serrate ligand


TPRnni774P Tfi 9^ Q 17^ 1 1 fifi 1 in 
IPB000998R 17 20 1 Ofip-19 49S-440 

IX UUUU770U 1 / .ZrVI l.UUC 1£ *T^O***T*T\/ 


2003 


PR00020 


MAM domain signature I 


PR00020A 20.48 2.88e-11 426-444
IPB000998C 18.63 5.30e-11 483-498

TPR0018R1R 19 95?RSRp-11 9S7-9^4 


2003 


PR00907 


Thrombomodulin signature II 


PR00907B 11.50 2.44e-10 160-176 


2003 


IPB000561 


EGF-like domain 


TPRnn.0^61 4 RO 1 9Sp 1fl 07 10^ 


2003 


IPB000033 


"Low-density lipoprotein (ldl)
receptor, YWTD repeat"


IPB000033B 7.05 5.35e-10 258-268
TPRnnnnuR i ^ Q7<>_nQ 9ii 991 

u DUUUUjjd /.Uj J.f/C-Uy ZlJ-ZZJ 


2003 


IPB000167 


Dehydrin


TPRD0ni^7A 8 S8 7 14p HQ 7/tru^7 

irDWUlO/A O.JO /.1HC-V7 J^VrjO/ 


2003 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367A 11.78 9.79e-09 175-195 


2004 


IPB001258


NHL repeat


IJrDlfUlZjoD Zo.Ol *f.JUe-l/ IUZ-IjO 

TPR0019SSR 9R 7 nn<a_17 R-il9 
IxOvUIZJOO ZO.Ol /.UUc-l/ 0~*rZ 

IPB001258B 28.61 5.60e-11 55-89


2005 


IPB000198


RhoGAP domain


irDUuuii/oL> io.*fy o.j le- 10 yjz-yoy 
IPB000198B 12.47 9.10e-15 862-879


2005 


IPB002219 


Phorbol esters/diacylglycerol binding 
domain 


IPB002219B 12.53 3.89e-11 753-768
IPB000198A 15.95 9.61e-10 810-826


2005 


IPB002551 


Coronavirus S1 glycoprotein


IPB002551J 18.56 3.60e-09 499-540 


2005 


IPB001369 


Purine and other phosphorylases 
family 2 


IPB001369C 24.81 4.27e-09 65-105 


2005 


IPB003351 


Dishevelled specific domain 


IPB003351C 13.82 7.24e-09 1054-1093 


2007 


PR01303 


Plasmodium circumsporozoite 
protein signature IV


PR01303D 10.57 9.21e-10 5-22 


2008 


IPB003164 


Alpha adaptin carboxyl-terminal 
domain 


IPB003164L 9.84 1.00e-40 48-82 
IPB003164N 8.78 1.00e-40 184-222 
IPB003164Q 13.71 1.00e-40 285-319

TPt3 AAO 1 Q 10 Af\ 1 r\n _ a r\ ncn nr\A 

irDUUolO'fo U.4U l.UUe-40 353-394 

TPRnfiOl 6AX> 1A <A o ic« io *>on ncn 

irouu^lOH-K llOU z.35e-38 320-352 
IPB003164O 13.89 8.62e-35 223-255 
IPB003164P 12.26 7.65e-33 256-284 
IPB003164M 10.25 5.18e-31 107-138 
irr>uujio*n 4.ooe-zj jyj-414 


2013 


IPB001359 


Synapsin 


IPB001359H 22.58 2.75e-09 14-64 

TPROni ^^QT-T 99 ce o £in no Af\ on 
lrouuioji/ri zz.jo j.oze-uy 4U-yu 


2015 


PR00456 


Ribosomal protein P2 signature V


PPnAAC/*.!? 0 AQ < *7 1 « no oo o<c 
rKwHOOii j.Uo j. / le-Uy zz-JO 


2016 


IPB003134


Repeat in HS1/Cortactin


TPt5A.fi0 1 0/117 IK 1 /10a An \ Ac 1 m 

lrDUUJlj4r ID.oo 1.4oe-09 143-193 


2017 


PR01297


v-ouiicin lybio protein signature 1 


pp Al 9Q*7 A & Af\ tZ AOa AO 1 £. on 


2018 


PR00205 


Cadherin signature IV


PPAAOA^T^ 10 OO O OC n 1 C 11 C/Z 

rKUUZUjJJ lz.Zz J.zje-lO 37-jO 

PR00205G 13.05 1.37e-13 40-57
PR00205F 19.57 3.10e-13 6-32 
PR00205C 13.59 6.62e-09 23-35 


2020 


IPB001862 


Membrane attack complex
components/perforin/complement C9


TPRAA 1 C/^O/^ OA AQ Q Q/1 a AA 1 1 O 1 1 

irJouuioozL, zo.4o o.y4e-uy 1 13-lol 


2021 


IPB001909 


KRAB box


IPB001909 17.37 8.65e-30 56-90 


2022 


IPB001909 


KRAB box 


IPB001909 17.37 8.65e-30 56-90 


2024 


IPB000560 


Histidine acid phosphatase 


IPB000560 17.02 1.00e-16 35-57


2026


IPB000822


"Zinc finger, C2H2 type"


IPB000822 14.67 7.00e-24 545-570


2026


IPB001909


KRAB box


IPB001909 17.37 2.86e-21 134-168 
IPB000822 14.67 2.29e-17 573-598 
IPB000822 14.67 3.57e-17 487-512
IPB000822 14.67 2.50e-13 515-540


2026 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 4.32e-11 570-583

DDAAA/fOA A C\A C O^-, 11 AO A A C\~1 

rKUUU4oA y.yn j.zoe-1 1 484-497 

PPAAA/IO A O QA O CO- 1 1 C/IO CCC 

rfs.uuu4oA y.y4 y.jje-i i j4z-x>*> 
PR00048B 5.52 1.00e-10 558-567 

PRftfl04RA Q Oil 1H ^19 ^9^ 
xr JvuUl/H-o/Y y.y^f J.oOe-lU jiZ-jZj 


2026 


IPB001012 


UBX domain 


IPB001012A 12.95 7.00e-10 297-312 


2026 


IPB001580 


Calreticulin family


IPB001580F 2.93 1.00e-09 305-314
PR00048B 5.52 6.50e-09 500-509 


2026 


PR01073


Presenilin 1 signature III


pRni07^r t i ^ ^9^ no oaa oi i 
rivuiu/jL i.HD o.oze-uy juu-jii 


2026 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 9.73e-09 298-322 


2029 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 4.15e-28 59-86 


2029 


IPB001134


"Netrin, C-terminus"


IPB001134C 17.82 4.13e-13 72-86
IPB001599K 8.15 1.46e-10 29-40


2031


PR00014


Fibronectin type III repeat signature
IV 


DD AAA1 AT*\ 1 C 1 O C 0/C« 1 A 1 "7 11 

FKUUU14U 1D.12 J.zoe-10 17-31 


2032


IPB000483


Leucine rich repeat C-terminal
domain


IPB000483 11.18 6.85e-13 118-132


2032


PR00019 


Leucine-rich repeat signature I


PR00019A 11.72 7.14e-11 27-40
PR00019B 11.42 8.09e-09 24-37 


2033 




Leucine rich repeat C-terminal
domain 


IPB000483 11.18 6.85e-13 118-132


2033 


PR00019 


Leucine-rich repeat signature I 


PR00019A 11.72 7.14e-11 27-40
PR00019B 11.42 8.09e-09 24-37 


2034 


IPB000203 


GPS domain 


IPB000203A 18.40 9.25e-20 991-1021 
IPB000203B 13.98 8.88e-15 1111-1132 


2034 


IPB000832 


G-protein coupled receptors family 2 
(secretin-like) 


IPB000832C 19.53 9.46e-13 1111-1140 


2034 


PR00249 


Secretin-like GPCR superfamily 
signature III 


PR00249C 15.44 1.73e-10 1113-1136
IPB000832G 15.17 7.81e-09 1281-1306 


2035 


IPB000822 


"Zinc finger, C2H2 type"


IPB000822 14.67 3.45e-21 51-76 
IPB000822 14.67 4.00e-19 79-104
IPB000822 14.67 3.40e-16 23-48 




PR00048


C2H2-type zinc finger signature I 


PR00048A 9.94 6.54e-14 20-33 


2035 


IPB001275 


DM DNA binding domain 


IPB001275 19.17 8.05e-14 11-50 
IPB001275 19.17 2.14e-13 39-78 
PR00048B 5.52 4.00e-11 92-101
PR00048A 9.94 6.21e-11 76-89
PR00048B 5.52 6.25e-11 64-73
PR00048A 9.94 5.09e-10 104-117 
PR00048B 5.52 2.00e-09 8-17 
IPB001275 19.17 4.53e-09 67-106 
PR00048A 9.94 8.12e-09 48-61 


2035 


PR00995 


36kDa capillovirus serine protease 
signature VI


PR00995F 16.50 9.73e-09 1-19 


2038 


PR00049


Wilm's tumour protein signature IV 


PR00049D 0.00 8.71e-10 8-22 
PR00049D 0.00 9.43e-10 9-23 


2038 


IPB003861


CrH- protein 


IPB003861B 9.06 1.98e-09 17-31
PR00049D 0.00 2.37e-09 12-26
PR00049D 0.00 2.53e-09 11-25
PR00049D 0.00 4.36e-09 10-24 


2038 


IPB002999 


Tudor domain


irayjvzyyyts 7.30 7.55e-09 13-21 

rPR009QQQT3 *7 Kfl H <TC« An i a 01 

lrxJuuzyyyjD /.du /.jje-Uy 14-22 
TPR009QQQR 7 ^fi 5? no 11 in 


2038 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 8.88e-09 199-224 


2039 


IPB001310 


HIT (Histidine triad) family


IPB001310A 18.76 3.25e-18 197-227
IPB001310B 21.00 2.93e-12 261-287


2039 


PR00332 


Histidine triad family signature II


PR00332B 14.02 6.26e-10 209-227


2039 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 2.13e-09 339-364 


2040 


IPB001310 


HIT (Histidine triad) family 


IPB001310A 18.76 3.25e-18 197-227 
IPB001310B 21.00 2.93e-12 261-287 


2040 


PR00332 


Histidine triad family signature II 


PR00332B 14.02 6.26e-10 209-227 


2040 


IPB000822


"Zinc finger, C2H2 type"


IPB000822 14.67 2.13e-09 339-364


2041 


IPB001310 


HIT (Histidine triad) family 


IPB001310A 18.76 3.25e-18 197-227 
IPB001310B 21.00 2.93e-12 261-287 


2041 


PR00332 


Histidine triad family signature II 


PR00332B 14.02 6.26e-10 209-227 


2041 


IPB000822


"Zinc finger, C2H2 type"


IPB000822 14.67 2.13e-09 339-364


2042 


IPB000135 


High mobility group proteins HMG1 
and HMG2


IPB000135D 2.13 4.52e-10 102-126
IPB000135D 2.13 9.71e-10 104-128
IPB000135D 2.13 9.90e-10 101-125
IPB000135D 2.13 3.18e-09 105-129
IPB000135D 2.13 9.55e-09 103-127 


2043 


PR00074


Protein-lysine 6-oxidase precursor
signature VIII


PR00074H 17.29 8.11e-19 264-283
PR00074E 11.34 3.88e-16 193-213
PR00074F 11.47 6.65e-16 217-238
PR00074B 7.56 4.98e-12 126-150 


2043 


IPB001695 


Lysyl oxidase 


IPB001695E 9.12 5.70e-12 110-151 
PR00074D 21.66 2.94e-10 171-192 


2043 


PR00258 


Speract receptor signature I 


PR00258A 13.56 1.70e-10 S-91
PR00258C 9.05 4.95e-10 43-53
PR00258D 14.29 6.29e-10 76-90
IPB001695F 11.10 6.24e-09 151-179


2046 


PR01254


Prostaglandin D synthase signature II 


PR01254B 12.05 1.17e-09 339-349 


2048 


IPB000374 


Phosphatidate cytidylyltransferase 


IPB000374B 15.86 2.06e-27 375-402 
IPB000374A 12.59 3.65e-16 271-283 


2049 


PR00320 


G protein beta WD-40 repeat 
signature I 



PR00320A 13.15 7.95e-11 118-132
PR00320B 12.82 2.08e-10 118-132
PR00320C 12.32 4.33e-09 118-132


2052


PR01446 


Claudin-8 signature III


PR01446C 9.62 2.27e-09 119-131 


2053 


IPB002884 


Proprotein convertase P-domain


lrpuuzott^o O.o9 6.33e-09 H4-131 


2054 


IPB000361 


Hypothetical hesB/yadR/yfhF family 


IPB000361B 19.14 3.08e-19 122-153
IPB000361A 17.83 2.71e-16 73-93 


2055 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 9.28e-10 133-170 


2055 


IPB000920 


Myelin P0 protein 


IPB000920C 15.78 3.92e-09 161-213 


2055 


PR00213 


Myelin P0 protein signature V 


PR00213E 5.51 8.97e-09 179-203 


2058 




C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 3.17e-17 27-79 
IPB001442A 26.12 3.60e-17 33-85 
IPB001442A 26.12 1.21e-16 39-91 


2058 


IPB000885 


Fibrillar collagen C-terminal domain


IPB000885B 19.15 2.19e-16 35-88 
IPB000885A 11.46 5.06e-16 40-77 
IPB001442A 26.12 6.02e-16 30-82 
IPB000885B 19.15 3.65e-15 44-97 
IPB000885B 19.15 4.39e-15 26-79 
IPB000885B 19.15 4.49e-15 32-85
IPB001442A 26.12 9.29e-15 24-76


2058 


PR00453 


Von Willebrand factor type A 
domain signature I 


PR00453A 11.78 1.75e-14 107-124 
IPB000885A 11.46 2.29e-14 43-80 
IPB000885A 11.46 3.92e-14 52-89 
IPB000885B 19.15 6.97e-14 29-82 
IPB001442A 26.12 7.65e-14 42-94 
IPB001442A 26.12 8.63e-14 45-97 
IPB001442A 26.12 1.00e-13 36-88 
IPB000885A 11.46 2.89e-13 37-74
IPB000885A 11.46 6.33e-13 49-86 
IPB000885B 19.15 7.07e-13 38-91 
IPB000885B 19.15 7.46e-13 41-94 


2058 


IPB001073 


Complement C1q protein


IPB001073A 22.14 1.72e-12 45-79 
IPB000885A 11.46 5.93e-12 55-92
IPB000885A 11.46 6.04e-12 46-83
IPB001073A 22.14 7.48e-12 48-82
IPB000885B 19.15 7.84e-12 23-76 

IPB000885B 19.15 8.88e-12 47-100
lrt>UU144iir5 Iz.jo 9.o5e-12 61-81 


2059 


IPB001541 


SUR2-type hydroxylase/desaturase
catalytic domain 


IPB001541A 12.30 5.50e-11 40-52
TPPW101 R 11 A AQ 1 0*7 1 1£

lrDuuij'f id i i.oo 4.ooe-uy i//-UO


2060


IPB003006 


Immunoglobulin and major
histocompatibility complex domain


IPB003006B 20.23 6.19e-09 134-171 


2061 


PR00918 


Calicivirus non-structural polyprotein 
family signature I


PR00918A 13.81 3.59e-12 37-57 


2061 


IPB002078 


Sigma-54 factor interaction protein 
family 


IPB002078A 20.43 6.31e-10 43-77 


2061 


PR00364 


Disease resistance protein signature I 


PR00364A 8.29 7.11e-10 42-57


2061 


IPB000765 


GTP1/OBG family


IPB000765 26.91 7.67e-10 41-84


2061 


PR00094 


Adenylate kinase signature I 


PR00094A 9.62 2.43e-09 44-57 


2061 


PR00830 


Endopeptidase La (Lon) serine
protease (S16) signature I


ppnnRinA a <o a ^ Ao no /it /Tic 
rxv.uuojUA. o.jz *f.jue-uy 4/-oo 


2067 


PR00874 


Fungi-IV metallothionein signature 
III 


PR00874C 4.37 6.50e-09 7-21 


2071 


PR01539 


Interleukin-1 receptor type II 
precursor signature IX 


PR01539I 14.65 9.06e-09 223-246


2074 


IPB001284 


Ribosomal protein L34e 


IPB001284A 18.97 3.48e-31 15-50 
IPB001284B 26.99 1.41e-28 53-85 


2074 


PR01250 


Ribosomal protein L34 signature IV 


PR01250D 13.87 2.69e-23 73-95 
PR01250B 13.36 7.92e-17 33-50
PR01250A 11.25 2.25e-13 20-33
PR01250C 9.53 4.52e-12 53-63 
IPB001284B 26.99 3.75e-09 82-114 


2076 


IPB000171 


Bacterial-type phytoene 
dehydrogenase 


IPB000171E 7.19 8.20e-09 294-304 


2077 


IPB001774 


Delta serrate ligand 


IPB001774D 19.23 5.91e-09 50-96 


2077 


IPB000034 


Laminin B


IPB000034C 12.97 7.31e-09 84-102 


2077 


IPB000561 


EGF-like domain 


IPB000561 4.89 8.07e-09 84-92 


2078 


IPB001774 


Delta serrate ligand 


IPB001774D 19.23 5.91e-09 50-96 


2078 


IPB000034 


Laminin B


IPB000034C 12.97 7.31e-09 84-102 


2078 


IPB000561 


EGF-like domain 


IPB000561 4.89 8.07e-09 84-92 


2079 


IPB001774 


Delta serrate ligand 


IPB001774D 19.23 5.91e-09 50-96 


2079 


IPB000034 


Laminin B 


IPB000034C 12.97 7.31e-09 84-102 


2079 


IPB000561 


EGF-like domain 


IPB000561 4.89 8.07e-09 84-92 


2080 


PR00436 


Interleukin-8 signature I 


PR00436A 15.20 9.36e-10 14-37 


2081 


IPB001187 


Tissue Factor (TF) 


IPB001187G 15.20 7.00e-10 33-69 


2081 


IPB001073 


Complement C1q protein


IPB001073A 22.14 2.69e-09 146-180 


2081 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 6.03e-09 205-219 
PR00049D 0.00 6.34e-09 207-221 
PR00049D 0.00 7.41e-09 203-217 


2081 


PR00499 


Neutrophil cytosol factor 2 signature 
I 


PR00499A 7.48 7.60e-09 791-808 


2081 


IPB001359 


Synapsin 


IPB001359H 22.58 8.08e-09 772-822 


2081 


IPB003036 


Gag P30 core shell protein 


IPB003036C 11.53 9.63e-09 155-171 


2082 


IPB001039 


"Major histocompatibility complex 
protein, Class I" 


IPB001039R 27 3 01p-00 IfR 


2083 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.71e-12 148-185 
IPB003006B 20.23 9.14e-12 441-478
IPB003006B 20.23 1.00e-11 248-285


2083 


PR01536 


Interleukin-1 receptor type I and type 
II family signature III 


PR01536C 19.92 9.23e-11 547-570
IPB003006B 20.23 6.40e-10 54-91
IPB003006B 20.23 9.64e-10 540-577
IPB003006B 20.23 8.62e-09 346-383
PR01536C 19.92 9.19e-09 155-178 


2084 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.71e-12 148-185 
IPB003006B 20.23 9.14e-12 441-478
IPB003006B 20.23 1.00e-11 248-285


2084 


PR01536 


Interleukin-1 receptor type I and type 
II family signature III 


PR01536C 19.92 9.23e-11 547-570
IPB003006B 20.23 6.40e-10 54-91
IPB003006B 20.23 9.64e-10 540-577
IPB003006B 20.23 8.62e-09 346-383
PR01536C 19.92 9.19e-09 155-178


2085 


IPB003006


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.71e-12 148-185 
IPB003006B 20.23 9.14e-12 441-478
IPB003006B 20.23 1.00e-11 248-285


2085 


PR01536 


Interleukin-1 receptor type I and type 
II family signature III 


PR01536C 19.92 9.23e-11 547-570
IPB003006B 20.23 6.40e-10 54-91 
IPB003006B 20.23 9.64e-10 540-577 
IPB003006B 20.23 8.62e-09 346-383 
PR01536C 19.92 9.19e-09 155-178 


2086 


IPB002117 


p53 tumor antigen 


IPB002117A 9.71 5.50e-15 13-23


2087 


IPB000074 


Apolipoprotein A1/A4/E


IPB000074B 29.17 7.49e-10 117-170 
IPB000074B 29.17 8.75e-10 95-148 
IPB000074B 29.17 9.20e-10 62-115 
IPB000074C 22.23 2.62e-09 90-127
IPB000074C 22.23 4.35e-09 112-149
IPB000074B 29.17 8.48e-09 201-254


2088


IPB000074 


Apolipoprotein A1/A4/E 


IPB000074B 29.17 7.49e-10 117-170 
IPB000074B 29.17 8.75e-10 95-148 
IPB000074B 29.17 9.20e-10 62-115
IPB000074C 22.23 2.62e-09 90-127
IPB000074C 22.23 4.35e-09 112-149 
IPB000074B 29.17 8.48e-09 201-254 


2090 


IPB001211 


Phospholipase A2


IPB001211B 17.16 3.12e-31 49-76


2090 


PR00389 


Phospholipase A2 signature III 


PR00389C 17.85 2.50e-20 61-79 
PR00389B 10.67 6.91e-16 42-60 
IPB001211D 11.66 5.50e-14 109-124 
PR00389E 13.06 8.20e-14 109-125 
IPB001211C 14.62 1.56e-11 84-102


2091 


PR01217 


Proline rich extensin signature VI 


PR01217F 4.24 8.40e-09 65-82 


2092 


IPB001354 


Mandelate racemase/muconate 
lactonizing enzyme family


IPB001354C 32.55 1.00e-24 255-296 
IPB001354D 32.92 2.07e-18 343-388 
IPB001354B 18.16 3.91e-18 132-158 


2094 


IPB000222 


Protein phosphatase 2C subfamily


IPB000222F 19.87 4.94e-15 285-305 
IPB000222E 14.28 6.33e-15 257-275
IPB000222G 9.17 1.95e-12 311-324

TT>T3AAA / ^0'^/ -,, C OA o AO- 11 -it/- 1 or- 

irDUUUzzZC o.o4 2.08e-l2 176-185 
IPB000222H 9.33 7.97e-12 347-359
IPB000222B 15.80 2.86e-10 144-154 
IPB000222D 11.74 2.74e-09 215-232 

TPRAAAOOOT Q Q1 >f *70« AO vino ^1^? 
XroUUUZZzl o.y L 4. /2e-l?9 4Uo-417 


2095 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 4.71e-15 107-122
IPB000152 8.86 1.47e-14 44-59


2095 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 1.47e-11 107-118


2095 


IPB000033


"Low-density lipoprotein (ldl)
receptor, YWTD repeat" 


IPB000033B 7.05 4.96e-11 49-59
IPB001881B 12.28 6.68e-11 44-55


2095 


PR00010 


Type II EGF-like signature III 


PR00010C 6.98 7.10e-10 49-59 

PP AAA 1 (\C^ & QQ *7 <CQ« ia i in 1 

rKUUuiuc o.yo /.ooe-lU 112-122 
IPB001881B 12.28 2.57e-09 5-16 

TPRnnnn^^R n c\< i n* no in i no 
lrouuuujjD /.uj j.ue-uy 112-122 


2095 


IPB003886 


Extracellular domain in nidogen 


IPB003886D 13.91 5.71e-09 107-126 


2096 


PR00245 


Olfactory receptor signature III


PR00245C 14.05 9.53e-17 218-234


2096 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 9.25e-14 160-171
PR00245D 9.34 1.53e-13 278-287
PR00245E 8.96 6.81e-12 325-336
PR00245B 13.73 1.00e-10 171-183 
IPB000276D 9.40 3.08e-09 324-340 


2096 


PR00237 


Rhodopsin-like GPCR superfamily 
signature V 


PR00237E 13.03 3.83e-09 241-264 


2096 


PR00534 


Melanocortin receptor family
signature I


PR00534A 12.77 5.17e-09 93-105
PR00237C 14.77 5.91e-09 146-168 


2096 


PR00896 


Vasopressin receptor signature II


PR00896B 9.36 7.23e-09 97-108
PR00237G 19.23 1.00e-08 314-340 


2097 


PR00245 


Olfactory receptor signature III


PR00245C 14.65 9.53e-17 218-234


2097 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 9.25e-14 160-171 
PR00245D 9.34 1.53e-13 278-287 
PR00245E 8.96 6.81e-12 325-336 
PR00245B 13.73 1.00e-10 171-183 
IPB000276D 9.40 3.08e-09 324-340 


2097 


PR00237 


Rhodopsin-like GPCR superfamily 
signature V 


PR00237E 13.03 3.83e-09 241-264 


2097 


PR00534 


Melanocortin receptor family 
signature I 


PR00534A 12.77 5.17e-09 93-105 
PR00237C 14.77 5.91e-09 146-168 


2097 


PR00896 


Vasopressin receptor signature II 


PR00896B 9.36 7.23e-09 97-108


2098
2098 

2102 


IPB001169 
PR01186 

PR00193


"Integrin beta, C-terminus"
Myosin heavy chain signature III


PR00237G 19.23 1.00e-08 314-340
IPB001169J 7.42 4.63e-10 49-62
PR01186K 7.39 7.27e-10 49-62
PR01186K 7.39 9.75e-09 15-28


2102 


IPB000857 


Core domain in kinesin and myosin 
motors 


PR00193C 11.66 9.77e-24 126-153 
IPB000857C 10.82 4.84e-19 124-146 
PR00193B 12.36 6.81e-18 74-99 
IPB000857D 12.93 7.64e-12 153-191 
PR00193A 14.87 8.50e-12 14-33 
IPB000857B 11.35 1.00e-10 55-101 


2102 


PR00364 


Disease resistance protein signature I


PR00364A 8.29 4.86e-09 76-91 


2103 


PR00193 


Myosin heavy chain signature III


PR00193C 11.66 9.77e-24 126-153 


2103 


IPB000857 


Core domain in kinesin and myosin 
motors 


IPB000857C 10.82 4.84e-19 124-146 
PR00193B 12.36 6.81e-18 74-99 
IPB000857D 12.93 7.64e-12 153-191 

PR00193A 14.87 8.50e-12 14-33
IPB000857B 11.35 1.00e-10 55-101


2103 


PR00364 


Disease resistance protein signature I 


PR00364A 8.29 4.86e-09 76-91


2105 


IPB002350


Kazal-type serine protease inhibitor
family


IPB002350 31.78 2.86e-18 77-117


2105 


IPB000716 


Thyroglobulin type-1 repeat


IPB000716C 17.62 2.88e-18 274-292
IPB000716D 15.49 7.16e-15 296-310


2109 


IPB000483 


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 5.50e-13 45-59


2111 


IPB000221 


Protamine P1 


IPB000221 5.48 3.08e-09 3-29


2112 


PR01415 


Ankyrin repeat signature II 


PR01415B 10.23 5.88e-09 26-38 


2113 


IPB000416 


Outer Capsid protein VP4
(Hemagglutinin)


IPB000416P 15.37 7.00e-09 188-226


2114 


IPB000416 


Outer Capsid protein VP4
(Hemagglutinin)


IPB000416P 15.37 7.00e-09 188-226


2115 


IPB000998 


MAM domain 


IPB000998C 18.63 1.95e-12 17-32 


2115 


PR00020 


MAM domain signature III 


PR00020C 12.01 8.12e-10 16-27

IPB000998D 18.66 9.61e-10 82-105 


2116 


IPB000998 


MAM domain 


IPB000998C 18.63 1.95e-12 17-32


2116 


PR00020 


MAM domain signature III 


PR00020C 12.01 8.12e-10 16-27
IPB000998D 18.66 9.61e-10 82-105


2118 


IPB002642 


Lysophospholipase catalytic domain 


IPB002642E 18.19 6.91e-10 86-111 


2119 


IPB002642 


Lysophospholipase catalytic domain


IPB002642E 18.19 6.91e-10 86-111


2120 


IPB000817 


Prion protein 


IPB000817A 8.34 7.73e-10 255-297 


2120 


IPB001442


C-terminal tandem repeated domain
in type 4 procollagen 


IPB001442A 26.12 7.26e-09 262-314 


2122 


IPB003006 


iiiiixiuiiugiuuuiiii duu major 
histocompatibility complex domain 


1PB003006B 20.23 I,43e-13 72-109 i 


2122 


IPB003531


Short hematopoietin receptor family 


IPB003531C 15.87 9.38e-11 318-335


2123 


IPB003006


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 1.43e-13 72-109 


2123 


IPB003531 


Short hematopoietin receptor family


IPB003531C 15.87 9.38e-11 318-335


2124 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 1.43e-13 72-109 


2124


IPB003531


Short hematopoietin receptor family


IPB003531C 15.87 9.38e-11 318-335


2125


IPB000008


C2 domain


IPB000008C 23.37 7.94e-25 109-148


2125 
2125 


PR00360 
PR00399 


C2 domain signature I 
Synaptotagmin signature II 


PR00360A 15.18 1.60e-13 107-119 
PR00399B 14.30 1.69e-12 94-107
IPB000008D 14.83 3.86e-11 164-182
PR00360B 11.64 5.94e-11 136-149
PR00399C 15.89 4.98e-10 151-166 
PR00399D 12.72 6.33e-10 171-181 
PR00399A 15.05 8.65e-09 79-94 


2126 


IPB002870 


Reprolysin family propeptide 


IPB002870B 24.73 3.78e-14 142-180 


2126 


IPB001670 


Iron-containing alcohol 


IPB001670D 13.90 5.50e-09 158-173 


2130 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 7.53e-26 8-60 


2130 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 4.52e-24 1-54 
IPB000885B 19.15 2.38e-23 19-72 
IPB001442A 26.12 8.04e-23 11-63 
IPB001442A 26.12 8.83e-23 20-72 
IPB000885B 19.15 2.32e-22 4-57
IPB001442A 26.12 2.93e-22 5-57
IPB001442A 26.12 5.37e-22 17-69 


2130 


PR00258 


Speract receptor signature I 


PR00258A 13.56 6.32e-16 87-103 
IPB001442A 26.12 7.91e-16 26-78 
IPB000885A 11.46 1.49e-15 33-70 
IPB000885A 11.46 5.74e-15 24-61
IPB000885B 19.15 5.98e-15 28-81
IPB000885A 11.46 8.30e-15 9-46
IPB000885A 11.46 2.99e-14 30-67
IPB000885B 19.15 4.13e-14 31-84


2130 


IPB001073 


Complement C1q protein


IPB001073A 22.14 8.40e-14 17-51 
IPB000885A 11.46 8.60e-14 39-76
IPB000885B 19.15 2.17e-13 34-87
IPB001073A 22.14 7.89e-13 23-57
PR00258B 7.94 8.42e-13 106-117
IPB001442A 26.12 2.17e-12 35-87
IPB001442B 12.38 2.98e-12 24-44
IPB001442B 12.38 5.58e-12 21-41
IPB001073A 22.14 6.94e-12 20-54
IPB001073A 22.14 8.38e-12 11-45
IPB001442A 26.12 8.47e-12 32-84
IPB001442B 12.38 8.47e-12 12-32
IPB001073A 22.14 8.74e-12 29-63
IPB001442B 12.38 9.69e-12 15-35
IPB001442B 12.38 1.71e-11 51-71
IPB001442B 12.38 2.86e-11 9-29
IPB001073A 22.14 3.83e-11 14-48
IPB000885B 19.15 5.90e-11 40-93
IPB001442B 12.38 8.86e-11 6-26
IPB001073A 22.14 9.17e-11 44-78
IPB001073A 22.14 9.50e-11 2-36
IPB001073A 22.14 1.15e-10 8-42 
IPB001073A 22.14 2.83e-10 26-60 


2130 


IPB000817 


Prion protein 


IPB000817A 8.34 2.88e-10 1-43 
IPB000885B 19.15 4.09e-10 37-90
IPB000885A 11.46 4.23e-10 42-79
IPB001073A 22.14 4.81e-10 47-81 
IPB001073A 22.14 5.12e-10 50-84 
IPB001073A 22.14 6.03e-10 5-39 
IPB001442A 26.12 9.26e-10 38-90 
IPB001442B 12.38 1.24e-09 18-38 
IPB001073A 22.14 2.13e-09 41-75 
IPB001442B 12.38 2.70e-09 3-23 
IPB001442B 12.38 4.65e-09 45-65
IPB001442B 12.38 5.62e-09 27-47
IPB000885A 11.46 5.87e-09 45-82 
IPB001442B 12.38 6.84e-09 48-68 
IPB001073A 22.14 9.30e-09 38-72


2131 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 7.53e-26 8-60 


2131 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 4.52e-24 1-54 
IPB000885B 19.15 2.38e-23 19-72 
IPB001442A 26.12 8.04e-23 11-63
IPB001442A 26.12 8.83e-23 20-72
IPB000885B 19.15 2.32e-22 4-57
IPB001442A 26.12 2.93e-22 5-57

IPB001442A 26.12 5.37e-22 17-69 


2131 


PR00258 


Speract receptor signature I 


PR00258A 13.56 6.32e-16 87-103 
IPB001442A 26.12 7.91e-16 26-78 
IPB000885A 11.46 1.49e-15 33-70 
IPB000885A 11.46 5.74e-15 24-61 
IPB000885B 19.15 5.98e-15 28-81 
IPB000885A 11.46 8.30e-15 9-46 
IPB000885A 11.46 2.99e-14 30-67 
IPB000885B 19.15 4.13e-14 31-84


2131 


IPB001073 


Complement C1q protein


IPB001073A 22.14 8.40e-14 17-51 
IPB000885A 11.46 8.60e-14 39-76 
IPB000885B 19.15 2.17e-13 34-87 
IPB001073A 22.14 7.89e-13 23-57 
PR00258B 7.94 8.42e-13 106-117
IPB001442A 26.12 2.17e-12 35-87
IPB001442B 12.38 2.98e-12 24-44
IPB001442B 12.38 5.58e-12 21-41
IPB001073A 22.14 6.94e-12 20-54
IPB001073A 22.14 8.38e-12 11-45
IPB001442A 26.12 8.47e-12 32-84
IPB001442B 12.38 8.47e-12 12-32
IPB001073A 22.14 8.74e-12 29-63
IPB001442B 12.38 9.69e-12 15-35
IPB001442B 12.38 1.71e-11 51-71
IPB001442B 12.38 2.86e-11 9-29
IPB001073A 22.14 3.83e-11 14-48
IPB000885B 19.15 5.90e-11 40-93
IPB001442B 12.38 8.86e-11 6-26
IPB001073A 22.14 9.17e-11 44-78
IPB001073A 22.14 9.50e-11 2-36
IPB001073A 22.14 1.15e-10 8-42 
IPB001073A 22.14 2.83e-10 26-60 


2131 


IPB000817 


Prion protein 


IPB000817A 8.34 2.88e-10 1-43 
IPB000885B 19.15 4.09e-10 37-90
IPB000885A 11.46 4.23e-10 42-79
IPB001073A 22.14 4.81e-10 47-81 
IPB001073A 22.14 5.12e-10 50-84 
IPB001073A 22.14 6.03e-10 5-39 
IPB001442A 26.12 9.26e-10 38-90 
IPB001442B 12.38 1.24e-09 18-38
IPB001073A 22.14 2.13e-09 41-75 
IPB001442B 12.38 2.70e-09 3-23 
IPB001442B 12.38 4.65e-09 45-65 
IPB001442B 12.38 5.62e-09 27-47 
IPB000885A 11.46 5.87e-09 45-82
IPB001442B 12.38 6.84e-09 48-68
IPB001073A 22.14 9.30e-09 38-72


2132 


IPB000237 


GRIP domain 


IPB000237B 30.66 3.22e-10 427-477 


2133 


IPB001909 


KRAB box 


IPB001909 17.37 6.50e-34 63-97 


2133 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 8.20e-22 354-379 
IPB000822 14.67 5.09e-21 438-463 
IPB000822 14.67 5.50e-20 606-631 
IPB000822 14.67 7.00e-20 578-603 
IPB000822 14.67 3.25e-19 522-547 
IPB000822 14.67 4.00e-19 326-351 
IPB000822 14.67 7.00e-19 410-435 
IPB000822 14.67 4.46e-18 494-519
IPB000822 14.67 6.14e-17 382-407 
IPB000822 14.67 3.40e-16 550-575 
IPB000822 14.67 4.00e-16 466-491 


2133 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 5.85e-14 547-560 
PR00048A 9.94 8.07e-13 351-364 
PR00048A 9.94 3.12e-12 519-532
PR00048A 9.94 4.71e-12 379-392 
PR00048A 9.94 4.71e-12 463-476 
PR00048B 5.52 7.00e-12 619-628 


2133 


IPB001275 




IPB001275 19.17 7.04e-12 398-437
PR00048A 9.94 7.88e-12 631-644
PR00048A 9.94 1.95e-11 603-616
PR00048A 9.94 4.32e-11 575-588
PR00048B 5.52 5.50e-11 451-460
PR00048A 9.94 1.00e-10 323-336
IPB001275 19.17 1.36e-10 426-465
IPB001275 19.17 1.49e-10 482-521
PR00048A 9.94 5.09e-10 435-448
IPB001275 19.17 5.14e-10 510-549


2133 


IPB002817 


ThiC family 


IPB002817H 11.39 5.42e-10 349-364
PR00048A 9.94 5.91e-10 491-504
IPB001275 19.17 8.18e-10 314-353
IPB001275 19.17 9.15e-10 454-493
PR00048B 5.52 9.36e-10 507-516
IPB001275 19.17 9.39e-10 342-381
IPB001275 19.17 9.39e-10 370-409
PR00048B 5.52 2.00e-09 339-348
IPB000822 14.67 2.13e-09 634-659
PR00048B 5.52 2.50e-09 591-600
IPB001275 19.17 2.71e-09 594-633
PR00048B 5.52 3.00e-09 535-544
IPB001275 19.17 3.62e-09 538-577
PR00048A 9.94 4.38e-09 407-420 


2133 


IPB000306


"FYVE Zn-finger,
rabphilin/VPS27/EEA1 type"


IPB000306 8.96 4.71e-09 350-362
PR00048B 5.52 5.50e-09 423-432
IPB000306 8.96 5.76e-09 630-642 
IPB000306 8.96 6.03e-09 434-446 
PR00048B 5.52 7.00e-09 367-376 
IPB002817H 11.39 7.34e-09 433-448 
IPB001275 19.17 8.18e-09 566-605 


2133 


IPB002634 


BolA-like protein 


IPB002634A 23.30 8.62e-09 375-409 


2137 


IPB000954 


Aminotransferase class-III pyridoxal-
phosphate


IPB000954B 21.02 9.38e-21 191-230 
IPB000954D 13.61 5.74e-17 277-295
IPB000954C 12.88 9.44e-14 240-255 


2138 


IPB000954 


Aminotransferase class-III pyridoxal-
phosphate


IPB000954B 21.02 9.38e-21 191-230 
IPB000954D 13.61 5.74e-17 277-295 
IPB000954C 12.88 9.44e-14 240-255


2139


IPB001254


"Serine proteases, trypsin family"


lrt>UUlZj4A y.ya O.I4e-15 33-49 


2139 


PR00722 


Chymotrypsin serine protease family 
(S1) signature I


PR00722A 12.06 4.54e-14 34-49 


2139 


IPB000001


Kringle


IPB000001D 11.31 7.56e-12 33-49 


2139 


IPB000177 


Apple domain 


IPB000177K 13.19 2.57e-10 35-67 
PR00722B 12.69 6.85e-10 90-104 


2142


IPB000152


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 3.89e-11 10-25
IPB000152 8.86 4.86e-11 128-143


2142


IPB001881


Calcium-binding EGF-like domain 


IPB001881B 12.28 7.63e-11 10-21


2142 


PR00010 


Type II EGF-like signature III


PR00010C 6.98 2.74e-10 133-143 


2142


IPB002899


EB module 


IPB002899B 11.81 5.59e-10 116-128 
IPB002899B 11.81 5.59e-10 157-169 
IPB001881B 12.28 6.57e-10 128-139 
IPB001881B 12.28 8.29e-10 169-180 
IPB001881A 8.72 9.36e-10 41-50 
IPB000152 8.86 9.72e-10 169-184


2142


IPB001862


Membrane attack complex 
components/perforin/complement C9 


IPB001862F 29.39 9.81e-10 26-73 
IPB001862F 29.39 1.28e-09 102-149 


2142 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat"


IPB000033B 7.05 5.03e-09 133-143 
PR00010A 12.91 7.27e-09 37-48 


2142 


IPB000561 


EGF-like domain 


IPB000561 4.89 7.43e-09 96-104 
IPB000561 4.89 7.43e-09 137-145 


2144


IPB000608


Ubiquitin-conjugating enzymes 


IPB000608 27.71 7.95e-12 72-116


2146 


IPB002181 


Fibrinogen beta and gamma chains 
C-terminal globular domain 


IPB002181B 20.16 7.49e-24 30-66 
IPB002181D 29.18 7.32e-15 92-132
IPB002181C 15.87 2.64e-10 71-83 


2147




Fibrinogen beta and gamma chains 
C-terminal globular domain 


IPB002181B 20.16 7.49e-24 30-66 
IPB002181D 29.18 7.32e-15 92-132
IPB002181C 15.87 2.64e-10 71-83 


2148 


IPB002181 


Fibrinogen beta and gamma chains 
C-terminal globular domain 


IPB002181B 20.16 7.49e-24 30-66 
IPB002181D 29.18 7.32e-15 92-132 
IPB002181C 15.87 2.64e-10 71-83 


2151 


IPB002027 


Amino acid permease 


IPB002027D 22.00 4.13e-25 248-287 
IPB002027C 19.67 2.74e-22 167-205 
IPB002027B 12.67 7.97e-12 103-122


2159 


PR00503 


Bromodomain signature IV 


PR00503D 19.24 3.57e-21 432-451 


2159 


IPB001487 


Bromodomain


IPB001487B 17.44 2.13e-19 423-444
PR00503B 10.44 4.37e-19 105-121
IPB001487A 11.44 5.20e-19 106-124
PR00503C 19.09 4.00e-17 121-139
IPB001487A 11.44 9.53e-16 399-417
PR00503A 14.57 4.00e-14 89-102
PR00503B 10.44 8.64e-14 398-414
PR00503D 19.24 9.25e-13 139-158
IPB001487B 17.44 1.58e-12 130-151
PR00503C 19.09 6.70e-11 414-432


2159




"Cu(A) centre of cytochrome c
oxidase, subunit II and nitrous oxide
reductase"


IPB001505B 15.93 5.94e-10 417-466 
IPB001505A 18.04 1.17e-09 104-151


2159 


IPB003351 


Dishevelled specific domain 


IPB003351C 13.82 5.13e-09 496-535 
PR00503A 14.57 6.81e-09 382-395 


2159 


PR01217 


Proline rich extensin signature IV 


PR01217D 4.57 7.49e-09 250-271 


2159 


PR01503


Treacher Collins syndrome protein 
Treacle signature II 


PR01503B 3.77 7.64e-09 714-727 


2159 


IPB000574 


Tymovirus coat protein 


IPB000574A 32.18 7.78e-09 265-312 


2159 


PR00910 


Luteovirus ORF6 protein signature I 


PR00910A 2.74 8.07e-09 266-278 


2159 


IPB001359 


Synapsin 


IPB001359H 22.58 8.51e-09 204-254


2159


IPB001978 


Troponin 


IPB001978B 22.99 9.15e-09 541-579


2160 


IPB002862 


Protein of unknown function DUF16 


IPB002862C 1 1 10 9 59p.no 6n R9 


2164 


IPB000961 


Protein kinase C-terminal domain


IPB000961D 21 71 S 9Qp-9Q 7 4R 


2164 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245B 21.68 2.80e-19 11-49


2164 


IPB000861 


PKN/rhophilin/rhotekin rho-binding 
repeat 


IPB000861G 13 11 9 6ftp-l6 11 rt? 


2164 


IPB001772 


Kinase associated domain 1 


IPB001772E 24.88 2.25e-14 69-108


2164 


IPB003527 


MAP kinase 


IPB003527G 17.26 8.86e-14 81-118
IPB001772D 21.67 4.73e-13 18-57
IPB003527D 21.53 4.66e-11 4-45


2164 


IPB000095


PAK-box /P21-Rho-binding


IPB000095F 16.47 9.65e-10 15-69


2164 


IPB000959 


POLO box duplicated region


IPB000959D 27.01 2.97e-09 62-114


2165 


IPB000961 


Protein kinase C-terminal domain


TPR0flfl96in 91 91 5 9Q/» 9Q 7_4fi 


2165 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245B 21.68 2.80e-19 11-49 


2165 


IPB000861 


PKN/rhophilin/rhotekin rho-binding
repeat




2165


IPB001772 


Kinase associated domain 1 


IPB001772E 24.88 2.25e-14 69-108 


2165 


IPB003527 


MAP kinase 


IPB003527G 17.26 8.86e-14 81-118 
IPB001772D 21.67 4.73e-13 18-57 

IPB003527D 21.53 4.66e-11 4-45


2165 


IPB000095 


PAK-box /P21-Rho-binding 


IPB000095F 16.47 9.65e-10 15-69 


2165 


IPB000959 


POLO box duplicated region 


IPB000959D 27.01 2.97e-09 62-114


2167 


IPB001073 


Complement C1q protein


IPB001073B 20.88 6.00e-26 147-181 
IPB001073A 22.14 4.48e-20 101-135


2167 


IPB000885 


Fibrillar collagen C-terminal domain 


IPB000885B 19.15 9.63e-20 70-123 


2167 


IPB001442 


C-terminal tandem repeated domain 
in type 4 procollagen 


IPB001442A 26.12 4.27e-19 71-123
IPB000885B 19.15 7.48e-19 76-129
IPB000885A 11.46 1.97e-18 78-115
IPB000885A 11.46 2.94e-18 84-121


2167 


PR00007 


Complement C1Q domain signature 
III 


PR00007C 16.13 3.67e-18 215-236
IPB001442A 26.12 1.11e-17 80-132
PR00007A 20.64 1.84e-17 140-166 
IPB001442A 26.12 1.87e-17 86-138 
IPB000885B 19.15 5.39e-17 73-126 
IPB000885A 11.46 6.96e-17 81-118 
IPB000885B 19.15 8.87e-17 67- 


2167 


IPB000817 


Prion protein 


IPB000817A 8.34 3.27e-09 67-109 
IPB000885A 11.46 3.66e-09 35-72 
IPB001442A 26.12 4.13e-09 28-80
IPB000885B 19.15 4.19e-09 42-95
IPB000885A 11.46 4.77e-09 102-139
IPB001442A 26.12 4.83e-09 40-92 
IPB001442B 12.38 5.99e-09 53-73 
IPB001442A 26.12 6.17e-09 37-89 
IPB000885B 19.15 7.55e-09 52-105 
IPB001442B 12.38 7.57e-09 87-107 
IPB001442B 12.38 8.54e-09 105-125 
IPB001073A 22.14 8.59e-09 46-80
IPB000885B 19.15 8.69e-09 94-147
IPB001442B 12.38 9.64e-09 90-110 


2169 


IPB002360 


Involucrin 


IPB002360C 15.36 3.06e-14 206-247 


2169 


PR00209 


Alpha/beta gliadin family signature II 


PR00209B 4.73 5.94e-12 226-244 
IPB002360C 15.36 5.93e-10 215-256
IPB002360C 15.36 2.50e-09 195-236 
IPB002360C 15.36 2.50e-09 214-255 


2169 


IPB001359


Synapsin 


IPB001359H 22.58 5.19e-09 220-270 
IPB002360C 15.36 5.20e-09 203-244
IPB002360C 15.36 5.70e-09 212-253
IPB002360C 15.36 6.10e-09 188-229 


2169 


IPB003753 


"Exonuclease VII, large subunit" 


IPB003753F 28.29 7.54e-09 181-231 
IPB002360C 15.36 8.80e-09 218-259


2170 


IPB002360 


Involucrin 


IPB002360C 15.36 3.06e-14 206-247


2170


PR00209 


Alpha/beta gliadin family signature II


PR00209B 4.73 5.94e-12 226-244 
IPB002360C 15.36 5.93e-10 215-256 
IPB002360C 15.36 2.50e-09 195-236
IPB002360C 15.36 2.50e-09 214-255 


2170


IPB001359


Synapsin 


IPB001359H 22.58 5.19e-09 220-270 
IPB002360C 15.36 5.20e-09 203-244 
IPB002360C 15.36 5.70e-09 212-253 
IPB002360C 15.36 6.10e-09 188-229 


2170 


IPB003753 


"Exonuclease VII, large subunit" 


IPB003753F 28.29 7.54e-09 181-231 
IPB002360C 15.36 8.80e-09 218-259 


III I 


IPB000483


Leucine rich repeat C-terminal 
domain 


IPB000483 11.18 5.50e-13 45-59 


2173 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.83e-11 69-106


III J 


PR00457 


Animal haem peroxidase signature 
VII 


PR00457G 14.17 4.48e-14 144-164 
PR00457H 14.82 5.85e-13 215-229 
PR00457F 14.42 6.32e-12 17-27 


2176 


PR00457 


Animal haem peroxidase signature 
VII 


PR00457G 14.17 4.48e-14 144-164 
PR00457H 14.82 5.85e-13 215-229 
PR00457F 14.42 6.32e-12 17-27 


2177 


PR00457 


Animal haem peroxidase signature
VII


PR00457G 14.17 4.48e-14 144-164 
PR00457H 14.82 5.85e-13 215-229 
PR00457F 14.42 6.32e-12 17-27 


2179 


IPB002151 


Kinesin light chain repeat 


IPB002151B 14.23 8.01e-10 259-311


2179 


IPB000421 


Coagulation factor 5/8 type C 
domain (FA58C) 


IPB000421A 21.21 7.85e-09 62-81


2180 


IPB003117 


Regulatory subunit of type II PKA R- 
subunit 


IPB003117C 17.01 1.00e-40 189-229 
IPB003117D 18.87 1.00e-40 240-280 
IPB003117G 17.45 8.50e-33 383-417 
IPB003117A 22.23 5.50e-26 66-98 
IPB003117E 18.84 5.85e-23 329-357 


2180 


IPB000595 


Cyclic nucleotide-binding domain 


IPB000595C 23.31 6.82e-21 363-388


2180 


PR00103 


cAMP-dependent protein kinase 
signature II 


PR00103B 10.32 7.00e-18 215-229
IPB000595B 15.72 7.50e-18 321-344
IPB003117F 17.26 1.00e-17 365-379
IPB000595B 15.72 4.43e-16 203-226 
PR00103A 9.07 7.75e-16 200-214 
IPB003117C 17.01 2.96e-15 307-347 
IPB003117D 18.87 4.14e-15 364-404 
PR00103E 12.91 5.91e-14 397-409 
PR00103D 10.18 2.93e-13 376-387 
IPB000595C 23.31 4.60e-13 239-264 
PR00103C 13.28 1.84e-11 364-373
PR00103D 10.18 2.98e-10 252-263 
IPB003117E 18.84 3.57e-10 199-227 
IPB003117E 18.84 5.43e-10 317-345 
IPB003117F 17.26 1.50e-09 241-255 
PR00103A 9.07 8.11e-09 318-332


2181 


IPB001478 


PDZ domain (also known as DHR or 
GLGF) 


IPB001478B 6.12 4.94e-09 49-58


2182 


IPB000907 


Lipoxygenase 


IPB000907J 20.31 5.50e-37 499-541 
IPB000907G 22.23 1.87e-34 346-388
IPB000907F 21.29 1.00e-28 313-345


2182 


PR00467 


Mammalian lipoxygenase signature 
VI 


PR00467F 12.25 9.41e-22 393-415 


2182 


PR00087 


Lipoxygenase signature III 


PR00087C 13.32 1.39e-21 348-368
IPB000907C 16.09 7.17e-21 195-221
IPB000907I 27.52 7.16e-19 438-491
IPB000907E 15.16 9.21e-18 270-294
PR00467D 17.16 9.57e-17 170-191 
IPB000907D 18.70 2.67e-16 236-263 
PR00467E 9.17 1.16e-15 267-286
PR00087A 20.06 3.52e-15 310-327 
PR00087B 13.69 5.11e-15 328-345 
IPB000907B 14.10 2.50e-13 132-147 
PR00467A 8.38 3.29e-13 11-28 
IPB000907H 18.37 5.86e-13 409-425 
PR00467B 14.98 5.88e-12 57-76 
PR00467G 16.61 3.37e-11 554-571
IPB000907A 16.20 4.21e-10 94-103 


2183 


IPB000907 


Lipoxygenase 


IPB000907J 20.31 5.50e-37 499-541 
IPB000907G 22.23 1.87e-34 346-388 
IPB000907F 21.29 1.00e-28 313-345 


2183 


PR00467 


Mammalian lipoxygenase signature 
VI 


PR00467F 12.25 9.41e-22 393-415 


2183 


PR00087 


Lipoxygenase signature III 


PR00087C 13.32 1.39e-21 348-368 
IPB000907C 16.09 7.17e-21 195-221 
IPB000907I 27.52 7.16e-19 438-491
IPB000907E 15.16 9.21e-18 270-294
PR00467D 17.16 9.57e-17 170-191 
IPB000907D 18.70 2.67e-16 236-263 
PR00467E 9.17 1.16e-15 267-286 
PR00087A 20.06 3.52e-15 310-327 
PR00087B 13.69 5.11e-15 328-345 
IPB000907B 14.10 2.50e-13 132-147 
PR00467A 8.38 3.29e-13 11-28
IPB000907H 18.37 5.86e-13 409-425 
PR00467B 14.98 5.88e-12 57-76 
PR00467G 16.61 3.37e-11 554-571
IPB000907A 16.20 4.21e-10 94-103 


2184 


IPB000907 


Lipoxygenase 


IPB000907J 20.31 5.50e-37 499-541 
IPB000907G 22.23 1.87e-34 346-388 
IPB000907F 21.29 1.00e-28 313-345 


2184 


PR00467 


Mammalian lipoxygenase signature 
VI 


PR00467F 12.25 9.41e-22 393-415 


2184 


PR00087 


Lipoxygenase signature III 


PR00087C 13.32 1.39e-21 348-368 
IPB000907C 16.09 7.17e-21 195-221
IPB000907I 27.52 7.16e-19 438-491 
IPB000907E 15.16 9.21e-18 270-294 
PR00467D 17.16 9.57e-17 170-191 
IPB000907D 18.70 2.67e-16 236-263 
PR00467E 9.17 1.16e-15 267-286 
PR00087A 20.06 3.52e-15 310-327 
PR00087B 13.69 5.11e-15 328-345
IPB000907B 14.10 2.50e-13 132-147 
PR00467A 8.38 3.29e-13 11-28 
IPB000907H 18.37 5.86e-13 409-425
PR00467B 14.98 5.88e-12 57-76 
PR00467G 16.61 3.37e-11 554-571
IPB000907A 16.20 4.21e-10 94-103


2193


IPB001774 


Delta serrate ligand 


IPB001774C 18.25 1.71e-31 37-79
IPB001774D 19.23 3.32e-25 83-129 


2193 


PR00011 


Type III EGF-like signature IV 


PR00011D 12.12 4.57e-12 39-57 
IPB001774C 18.25 2.15e-10 68-110


2193 


PR00010 


Type II EGF-like signature III 


PR00010C 6.98 3.90e-10 113-123 
PR00011B 13.08 7.88e-10 39-57


2193 


IPB000561 


EGF-like domain 


IPB000561 4.89 9.25e-10 46-54 


2193 


IPB001886


Laminin N-terminal (Domain VI) 


IPB001886E 10.90 9.67e-10 44-60 


2193 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 6.21e-09 108-123 
PR00011A 14.05 6.88e-09 39-57 


2193 


IPB000034 


Laminin B 


IPB000034A 22.21 9.00e-09 96-131 


2193 


IPB001762 


Disintegrin 


IPB001762A 23.93 9.65e-09 126-166 


2195 


IPB000467 


D111/G-patch domain


IPB000467 8.65 1.00e-08 329-339 


2197 


IPB002467 


"Methionine aminopeptidase, 
subfamily 1" 


IPB002467C 17.56 2.29e-30 184-212 
IPB002467B 12.68 2.50e-23 158-179 


2197 


PR00599 


Methionine aminopeptidase-1
signature II 


PR00599B 10.21 8.00e-17 188-204
IPB002467D 14.78 5.50e-15 257-282 
PR00599A 11.84 9.63e-14 166-179 
IPB002467F 18.38 1.58e-12 315-345 
IPB002467E 11.05 7.75e-12 290-302 
PR00599D 14.43 5.03e-10 288-300 
IPB002467A 15.75 2.87e-09 130-147


2197 


IPB001131 


Proline dipeptidase 


IPB001131D 11.56 5.18e-09 290-303 
IPB001131B 18.96 8.10e-09 188-209 


2198 


IPB002889


WSC domain 


IPB002889B 11.76 1.88e-12 366-412
IPB002889B 11.76 3.54e-11 365-411
IPB002889B 11.76 4.96e-10 367-413 
IPB002889B 11.76 6.84e-10 363-409 
IPB002889B 11.76 7.13e-10 362-408 
IPB002889B 11.76 4.19e-09 357-403


2198 


IPB003351 


Dishevelled specific domain 


IPB003351C 13.82 4.49e-09 372-411 
IPB002889B 11.76 4.56e-09 353-399 
IPB002889B 11.76 7.00e-09 355-401 
IPB002889C 9.89 8.52e-09 367-388 


2199 


PR00918 


Calicivirus non-structural polyprotein 
family signature I 


PR00918A 13.81 5.85e-11 192-212


2199 


PR00364 


Disease resistance protein signature I 


PR00364A 8.29 4.71e-09 197-212 


2199 


PR01102 


5-hydroxytryptamine 6 receptor 
signature XIII 


PR01102M 11.13 6.71e-09 1013-1035 


2199 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 7.71e-09 1021-1035 


2200 


IPB001478 


PDZ domain (also known as DHR or 
GLGF) 


IPB001478A 11.55 5.09e-09 61-71 
IPB001478B 6.12 1.00e-08 79-88


2202 


PR01286 


Orphan nuclear receptor NOR1
signature V 


PR01286E 5.27 9.26e-09 322-343 


2203 


IPB000998 


MAM domain 


IPB000998D 18.66 1.96e-15 546-569 


2203 


IPB003886


Extracellular domain in nidogen 


IPB003886D 13.91 8.77e-15 253-272 


2203 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 2.89e-14 126-141 


2203 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 5.00e-14 208-219 
IPB000152 8.86 1.00e-13 253-268 
IPB000152 8.86 1.82e-13 208-223 
IPB001881B 12.28 4.75e-13 126-137 


2203 


IPB001774 


Delta serrate ligand 


IPB001774C 18.25 9.13e-13 88-130 
IPB000998B 17.20 1.00e-12 428-440


2203 


PR00020 


MAM domain signature I 


PR00020A 20.48 2.88e-11 426-444
IPB000998C 18.63 5.30e-11 483-498
IPB001881B 12.28 8.58e-11 253-264


2203


PR00907


Thrombomodulin signature II


PR00907B 11.50 2.44e-10 160-176


2203 


IPB000561 


EGF-like domain 


IPB000561 4.89 3.25e-10 97-105 


2203 


IPB000033


"Low-density lipoprotein (ldl)
receptor, YWTD repeat"


IPB000033B 7.05 5.35e-10 258-268 
IPB000033B 7.05 5.97e-09 213-223 


2203 


IPB000167 


Dehydrin


IPB000167A 8.58 7.14e-09 340-367


2203 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367A 11.78 9.79e-09 175-195


2204 


IPB000998


MAM domain 


IPB000998D 18.66 1.96e-15 546-569


2204 


IPB003886 


Extracellular domain in nidogen 


IPB003886D 13.91 8.77e-15 253-272 


2204 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 2.89e-14 126-141 


2204 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 5.00e-14 208-219 
IPB000152 8.86 1.00e-13 253-268
IPB000152 8.86 1.82e-13 208-223
IPB001881B 12.28 4.75e-13 126-137


2204 


IPB001774 


Delta serrate ligand 


IPB001774C 18.25 9.13e-13 88-130
IPB000998B 17.20 1.00e-12 428-440


2204 


PR00020 


MAM domain signature I 


PR00020A 20.48 2.88e-11 426-444
IPB000998C 18.63 5.30e-11 483-498
IPB001881B 12.28 8.58e-11 253-264


2204 


PR00907 


Thrombomodulin signature II 


PR00907B 11.50 2.44e-10 160-176 


2204 


IPB000561 


EGF-like domain


IPB000561 4.89 3.25e-10 97-105


2204 


IPB000033 


"Low-density lipoprotein (ldl)
receptor, YWTD repeat"


IPB000033B 7.05 5.35e-10 258-268
IPB000033B 7.05 5.97e-09 213-223


2204 


IPB000167 


Dehydrin 


IPB000167A 8.58 7.14e-09 340-367 


2204 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367A 11.78 9.79e-09 175-195 


2205 


IPB002893 


MYND zinc finger (ZnF) domain 


IPB002893 16.28 4.52e-17 663-681


2205 


IPB001664


Intermediate filament proteins 


IPB001664B 17.44 6.20e-09 569-608


2205 


IPB002889 


WSC domain 


IPB002889B 11.76 6.34e-09 488-534
IPB002889C 9.89 8.12e-09 437-458
IPB002889B 11.76 9.91e-09 419-465


2206 


IPB002893 


MYND zinc finger (ZnF) domain


IPB002893 16.28 4.52e-17 663-681


2206 


IPB001664 


Intermediate filament proteins 


IPB001664B 17.44 6.20e-09 569-608


2206 


IPB002889 


WSC domain 


IPB002889B 11.76 6.34e-09 488-534

IPB002889C 9.89 8.12e-09 437-458 
IPB002889B 11.76 9.91e-09 419-465 


2207 


IPB002893 


MYND zinc finger (ZnF) domain 


IPB002893 16.28 4.52e-17 663-681


2207 


IPB001664 


Intermediate filament proteins 


IPB001664B 17.44 6.20e-09 569-608 


2207 


IPB002889 


WSC domain 


IPB002889B 11.76 6.34e-09 488-534
IPB002889C 9.89 8.12e-09 437-458


2208 


IPB002893 


MYND zinc finger (ZnF) domain 


IPB002893 16.28 4.52e-17 663-681 


2208 


IPB001664 


Intermediate filament proteins 


IPB001664B 17.44 6.20e-09 569-608


2208 


IPB002889 


WSC domain 


IPB002889B 11.76 6.34e-09 488-534 

IPB002889C 9.89 8.12e-09 437-458
IPB002889B 11.76 9.91e-09 419-465


2210 


PR00918 


Calicivirus non-structural polyprotein
family signature I


PR00918A 13.81 5.85e-11 88-108


2210


PR00364


Disease resistance protein signature I


PR00364A 8.29 4.71e-09 93-108 


2211 


IPB001762 


Disintegrin 


IPB001762A 23.93 4.33e-23 19-59 


2211 


PR00289 


Disintegrin signature I 


PR00289A 14.29 1.16e-14 35-54
IPB001762B 10.06 3.40e-12 66-76


2211 


IPB001774 


Delta serrate ligand 


IPB001774C 18.25 5.31e-10 238-280 
PR00289B 11.74 3.80e-09 64-76


2211 


IPB003306 


WIF domain 


IPB003306E 25.51 7.40e-09 215-260 


2212 


IPB000159 


RA domain 


IPB000159A 11.28 7.60e-10 115-124 


2212 


IPB001359


Synapsin 


IPB001359H 22.58 5.89e-09 108-158 


2213 


PR00308 


Type I antifreeze protein signature III 


PR00308C 2.79 1.00e-11 729-738


2213


IPB000906


ZU5 domain 


IPB000906E 22.11 5.55e-11 256-296


2213 


PR01415 


Ankyrin repeat signature I 


PR01415A 12.73 6.46e-11 259-271
IPB000906D 23.89 6.59e-11 324-378
PR01415A 12.73 7.11e-11 192-204
PR01415A 12.73 7.43e-11 160-172
PR00308B 3.38 9.53e-11 729-740
PR00308A 3.72 5.19e-10 726-740 
IPB000906F 35.93 5.85e-10 202-255 


2213 


PR01511


Kv1.4 voltage-gated K+ channel
signature IV 


PR01511D 3.91 9.26e-10 727-737
PR01415B 10.23 5.88e-09 271-283
IPB000906G 25.85 6.69e-09 338-386 
IPB000906A 22.49 7.84e-09 185-227 
PR00308A 3.72 9.11e-09 727-741
PR00308C 2.79 9.64e-09 727-736 


2214 


IPB000471 


"Interferon alpha, beta and delta 
family" 


IPB000471A 27.36 2.86e-34 56-109 


2214 


PR00266 


Interferon alpha and beta subunit 
signature I 


PR00266A 13.41 9.59e-14 78-90 


2219 


PR00405 


HIV Rev interacting protein 
signature II 


PR00405B 10.10 2.93e-17 290-307 
PR00405A 18.83 4.89e-14 271-290 


2219 


PR01415


Ankyrin repeat signature I 


PR01415A 12.73 1.32e-11 419-431
PR00405C 18.05 2.55e-09 311-332


2220 


PR00405 


HIV Rev interacting protein 
signature II 


PR00405B 10.10 2.93e-17 290-307 
PR00405A 18.83 4.89e-14 271-290 


2220 


PR01415


Ankyrin repeat signature I 


PR01415A 12.73 1.32e-11 419-431
PR00405C 18.05 2.55e-09 311-332


2221 


PR00405 


HIV Rev interacting protein 
signature II 


PR00405B 10.10 2.93e-17 290-307 
PR00405A 18.83 4.89e-14 271-290 


2221 


PR01415 


Ankyrin repeat signature I 


PR01415A 12.73 1.32e-11 419-431
PR00405C 18.05 2.55e-09 311-332 


2222 


PR00405 


HIV Rev interacting protein 
signature II 


PR00405B 10.10 2.93e-17 290-307
PR00405A 18.83 4.89e-14 271-290


2222 


PR01415 


Ankyrin repeat signature I 


PR01415A 12.73 1.32e-11 419-431
PR00405C 18.05 2.55e-09 311-332


2223 


IPB002870


Reprolysin family propeptide 


lPBOOZo/Ur lo.ol 2.35e-19 59-83 
lrr>UOzo/Ufc l l.yu J.3 /e-lo ZJ-Jj 


2223 


IPB000130 


"Neutral zinc metallopeptidases, 
zinc-binding region" 


IPB000130 5.86 1.86e-09 21-31 


2223 


PR00480 


Astacin family signature II 


PR00480B 14.35 3.45e-09 16-34 


2224 


IPB000329 


Uteroglobin family 


IPB000329A 11.99 3.57e-10 1-16 


2224 


PR00486 


Uteroglobin signature I 


PR00486A 6.53 9.03e-09 2-16 


2225 


IPB001073 


Complement C1q protein


IPB001073A 22.14 6.55e-13 67-101 


2228 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 6.09e-11 11-48


2229 


IPB001759 


Pentaxin family 


IPB001759D 18.25 4.67e-33 471-509 


2229 


PR00895 


Pentaxin signature V 


PR00895E 12.84 4.19e-18 479-498 
PR00895D 14.46 2.38e-17 459-478 
PR00895C 12.82 3.18e-17 432-450
IPB001759C 13.49 4.30e-17 432-450 
IPB001759A 29.51 1.82e-14 175-209 
PR00895A 14.28 8.83e-13 366-380 
IPB001759E 18.14 5.34e-11 521-535
PR00895F 15.89 9.50e-11 498-512


2229 


IPB002751 


Cobalamin synthesis CBIM 


IPB002751C 15.32 1.00e-08 50-79 


2235 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 8.71e-11 73-98


2239 


IPB000917 


Sulfatase 


IPB000917B 9.25 6.40e-13 103-113
IPB000917A 9.52 5.26e-10 59-70


2240


IPB000834 


"Zinc carboxypeptidases, 
carboxypeptidase A metalloprotease 
(M14) family" 


IPB000834B 13.51 2.50e-17 37-51


2240 


PR00765 


Carboxypeptidase A metalloprotease 
(M14) family signature II 


PR00765B 14.48 1.39e-15 33-47 
IPB000834C 17.20 2.80e-15 106-122
IPB000834D 18.95 4.72e-12 133-159
PR00765C 10.88 1.82e-10 113-121


2241 


IPB000834 


"Zinc carboxypeptidases, 
carboxypeptidase A metalloprotease 
(M14) family" 


IPB000834B 13.51 2.50e-17 37-51


2241 


PR00765 


Carboxypeptidase A metalloprotease 
(M14) family signature II 


PR00765B 14.48 1.39e-15 33-47
IPB000834C 17.20 2.80e-15 106-122 
IPB000834D 18.95 4.72e-12 133-159 
PR00765C 10.88 1.82e-10 113-121 


2242 


IPB002871 


NifU-like N terminal domain


IPB002871C 16.51 1.60e-33 81-113 
IPB002871D 14.11 6.87e-21 131-153 
IPB002871A 14.39 2.17e-17 35-50 
IPB002871B 12.43 6.79e-14 62-74 


2244 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 8.29e-11 97-122


2244 


PR00048 


C2H2-type zinc finger signature II 


PR00048B 5.52 9.50e-09 110-119 


2245 


IPB003527 


MAP kinase 


IPB003527D 21.53 5.58e-23 214-255 
IPB003527G 17.26 8.24e-22 314-351 
IPB003527C 14.70 3.05e-19 153-201 


2245 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245A 22.45 5.50e-17 161-201 


2245 


IPB000959 


POLO box duplicated region 


IPB000959B 15.68 7.19e-17 145-185
IPB001245B 21.68 1.39e-15 221-259 


2245 


IPB001772 


Kinase associated domain 1 


IPB001772C 20.66 3.92e-14 156-186 


2245 


IPB000095 


PAK-box /P21-Rho-binding 


IPB000095C 13.36 7.91e-13 75-111
IPB003527A 17.00 6.14e-12 55-80 


2245 


IPB000861 


PKN/rhophilin/rhotekin rho-binding 
repeat 


IPB000861G 13.73 7.44e-12 223-272 


2245 


IPB000961 


Protein kinase C-terminal domain 


IPB000961D 21.23 5.91e-11 217-258
IPB003527B 11.51 9.15e-11 127-145


2245 


PR00109 


Tyrosine kinase catalytic domain 
signature II 


PR00109B 11.07 9.10e-10 168-186 
IPB000961C 15.48 8.83e-09 168-202 


2248 


IPB001073 


Complement C1q protein


IPB001073B 20.88 7.26e-29 42-76 


2248 


PR00007 


Complement C1Q domain signature I 


PR00007A 20.64 6.54e-20 35-61 
PR00007C 16.13 2.62e-15 110-131 
IPB001073C 13.07 1.87e-14 110-129 
PR00007B 15.63 3.13e-14 62-81


2250 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 4.24e-10 325-362


2251 


IPB001895 


Guanine-nucleotide dissociation 
stimulators CDC25 family 


IPB001895C 20.83 7.84e-30 1097-1132 
IPB001895D 18.68 1.00e-20 1194-1217 


2251 


IPB001331 


Guanine-nucleotide dissociation 
stimulators CDC24 family 


IPB001331C 16.09 1.00e-18 397-422 
IPB001895B 16.80 3.10e-15 1025-1045 
IPB001331B 19.33 7.00e-09 346-361 


2253 


IPB000135 


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 3.91e-09 202-226 


2253 


PR00169 


Potassium channel signature I 


PR00169A 17.48 5.50e-09 68-87 


2253 


IPB002360


Involucrin 


IPB002360C 15.36 9.10e-09 198-239 


2253 


PR01083 


Lymphocyte-specific protein 
signature I 


PR01083A 8.60 9.61e-09 214-237 


2258 


IPB000433 


ZZ Zinc finger 


IPB000433 14.10 8.20e-18 23-39 


2258 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 7.86e-10 82-107 


2261 


IPB000135 


High mobility group proteins HMG1
and HMG2 


IPB000135D 2.13 5.91e-11 889-913
IPB000135D 2.13 7.44e-11 897-921
IPB000135D 2.13 7.85e-11 899-923
IPB000135D 2.13 3.05e-10 895-919
IPB000135D 2.13 5.11e-10 893-917
IPB000135D 2.13 8.14e-10 900-924
IPB000135D 2.13 2.27e-09 888-912
IPB000135D 2.13 2.27e-09 894-918 
IPB000135D 2.13 2.36e-09 892-916 


2261 


PR00806 


Vinculin signature IV 


PR00806D 11.95 3.78e-09 577-592 
IPB000135D 2.13 3.91e-09 886-910
IPB000135D 2.13 4.45e-09 901-925
IPB000135D 2.13 6.36e-09 896-920
IPB000135D 2.13 7.00e-09 891-915
IPB000135D 2.13 7.18e-09 898-922
IPB000135D 2.13 9.27e-09 932-956 


2262 


IPB000135


High mobility group proteins HMG1 
and HMG2 


IPB000135D 2.13 6.43e-17 577-601 
IPB000135D 2.13 9.71e-17 576-600 
IPB000135D 2.13 4.90e-16 580-604 
IPB000135D 2.13 578-602
IPB000135D 2.13 1.13e-15 581-605
IPB000135D 2.13 7.30e-15 579-603
IPB000135D 2.13 7.45e-14 582-606 
IPB000135D 2.13 3.08e-13 575-599 
IPB000135D 2.13 8.50e-13 584-608 
IPB000135D 2.13 8.62e-13 583-607
IPB000135D 2.13 9.08e-13 571-595 
IPB000135D 2.13 9.88e-13 586-610 
IPB000135D 2.13 1.65e-12 574-598
IPB000135D 2.13 4.36e-12 572-596 
IPB000135D 2.13 8.70e-12 585-609 
IPB000135D 2.13 8.36e-11 587-611
IPB000135D 2.13 8.67e-11 573-597
IPB000135D 2.13 4.42e-10 567-591 
IPB000135D 2.13 3.27e-09 570-594


2262


IPB000637 


HMG-I and HMG-Y DNA-binding 
domain (A+T-hook) 


IPB000637B 14.21 4.27e-09 576-594 
IPB000135D 2.13 4.45e-09 569-593 
IPB000637B 14.21 5.09e-09 585-603 


2262 


IPB003403 


Herpesvirus immediate early protein 


IPB003403E 17.25 5.45e-09 577-604 
IPB000135D 2.13 7.18e-09 568-592 


2262 


IPB001422 


Neuromodulin (GAP-43) 


IPB001422C 16.82 8.54e-09 575-610 


2262 


IPB001580 


Calreticulin family 


IPB001580F 2.93 9.10e-09 590-599 


2265 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 8.71e-12 148-185 
IPB003006B 20.23 9.14e-12 441-478 
IPB003006B 20.23 1.00e-11 248-285


2265 


PR01536 


Interleukin-1 receptor type I and type 
II family signature III


PR01536C 19.92 9.23e-11 547-570
IPB003006B 20.23 6.40e-10 54-91 
IPB003006B 20.23 9.64e-10 540-577 
IPB003006B 20.23 8.62e-09 346-383 
PR01536C 19.92 9.19e-09 155-178 


2266 


IPB000967 


Zinc finger NF-X1 type 


IPB000967D 10.42 6.89e-09 716-751 


2269


IPB002048 


EF-hand family 


IPB002048 7.91 2.29e-11 178-190


2269 


PR00450 


Recoverin family signature III 


PR00450C 11.99 1.58e-09 64-85 
IPB002048 7.91 8.58e-09 105-117 


2270 


IPB003846 


Uncharacterized protein family 
UPF0061 


IPB003846E 18.41 1.00e-40 132-170 
IPB003846E 18.41 1.00e-40 511-549 
IPB003846F 24.67 9.36e-31 171-206 
IPB003846F 24.67 9.36e-31 550-585 
IPB003846C 15.01 4.05e-28 8-51 
IPB003846G 13.31 5.09e-09 264-274
IPB003846G 13.31 5.09e-09 643-653


2271 


IPB003846 


Uncharacterized protein family 
UPF0061


IPB003846E 18.41 1.00e-40 132-170
IPB003846E 18.41 1.00e-40 511-549
IPB003846F 24.67 9.36e-31 171-206 
IPB003846F 24.67 9.36e-31 550-585 
IPB003846C 15.01 4.05e-28 8-51 
IPB003846G 13.31 5.09e-09 264-274 
IPB003846G 13.31 5.09e-09 643-653 


2272 


PR00237 


Rhodopsin-like GPCR superfamily 
signature VI 


PR00237F 14.34 1.67e-13 51-75
PR00237G 19.23 4.00e-13 89-115 


2272 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276B 4.97 6.62e-13 1-12 
IPB000276D 9.40 4.52e-10 99-115


2273 


PR00019 


Leucine-rich repeat signature I 


PR00019A 11.72 2.80e-13 89-102 
PR00019B 11.42 6.33e-10 86-99 


2274 


IPB000873 


AMP-dependent synthetase and 
ligase 


IPB000873A 11.08 6.06e-14 26-41 


2275 


IPB000595 


Cyclic nucleotide-binding domain 


IPB000595B 15.72 6.40e-11 136-159


2276 


IPB000595 


Cyclic nucleotide-binding domain


IPB000595B 15.72 6.40e-ll 136-159 


2281 


IPB003452 


Stem cell factor 


IPB003452C 13.68 8.56e-37 207-240 


2281 


IPB000808 


Mrp family 


IPB000808A 23.51 1.11e-12 16-60


2281 


IPB003348 


Anion-transporting ATPase 


IPB003348A 20.06 6.60e-11 21-58


2282


PR00205


Cadherin signature VI 


PR00205F 19.57 3.37e-17 55-81 
PR00205B 20.09 6.67e-16 113-142 
PR00205F 19.57 6.70e-13 166-192 
PR00205E 10.82 2.17e-10 111-124 


2282 


IPB002126 


Cadherin domain 


IPB002126A 14.68 6.09e-10 170-186 
PR00205A 17.38 3.12e-09 159-178 


2283 


PR00205 


Cadherin signature VI 


PR00205F 19.57 3.37e-17 55-81 
PR00205B 20.09 6.67e-16 113-142 
PR00205F 19.57 6.70e-13 166-192
PR00205E 10.82 2.17e-10 111-124


2283 


IPB002126 


Cadherin domain 


IPB002126A 14.68 6.09e-10 170-186 
PR00205A 17.38 3.12e-09 159-178 


2286 


IPB002027 


Amino acid permease 


IPB002027D 22.00 4.13e-25 248-287 
IPB002027C 19.67 2.74e-22 167-205 
IPB002027B 12.67 7.97e-12 103-122 


2287 


IPB000559 


Formate-tetrahydrofolate ligase 


IPB000559C 13.05 1.00e-40 395-444
IPB000559F 12.78 1.00e-40 595-645
IPB000559G 15.54 1.00e-40 649-697
IPB000559D 22.27 4.33e-37 496-536 
IPB000559E 17.08 7.39e-36 537-578 
IPB000559K 15.77 8.96e-35 875-910 
IPB000559B 12.60 2.88e-32 355-383 
IPB000559J 17.25 5.94e-32 842-874 
IPB000559H 20.31 2.72e-26 712-752 
IPB000559A 24.17 6.11e-25 310-354 
IPB000559I 15.05 6.35e-18 798-822 


2287 


PR00085 


Tetrahydrofolate
dehydrogenase/cyclohydrolase 
family signature III 


PR00085C 13.81 5.70e-14 112-133 
PR00085B 16.65 1.23e-09 79-106


2287 


IPB000672 


Tetrahydrofolate 
dehydrogenase/cyclohydrolase 


IPB000672C 28.03 6.83e-09 153-200 


2288 


IPB000560 


Histidine acid phosphatase 


IPB000560 17.02 7.86e-11 391-413


2290 


PR00390 


Phospholipase C signature I 


PR00390A 14.24 6.34e-20 2-20 


2292 


PR00245 


Olfactory receptor signature III 


PR00245C 14.65 5.26e-17 183-199 
PR00245E 8.96 2.73e-13 290-301 
PR00245B 13.73 1.39e-12 136-148
PR00245D 9.34 8.33e-11 243-252


2292 


IPB000276 


Rhodopsin-like GPCR superfamily


IPB000276A 11.56 1.47e-10 125-136 
PR00245A 10.98 8.80e-10 99-110 
IPB000276D 9.40 9.61e-10 289-305 


2292 


PR00896 


Vasopressin receptor signature II 


PR00896B 9.36 5.50e-09 62-73 


2292 


PR00534 


Melanocortin receptor family 
signature I 


PR00534A 12.77 5.70e-09 58-70


2292 


PR00237 


Rhodopsin-like GPCR superfamily 
signature II 


PR00237B 12.45 7.16e-09 66-87 
PR00237E 13.03 8.20e-09 206-229 


2292 


IPB003211 


AmiS/UreI family transporter


IPB003211A 15.05 9.43e-09 35-74


2293 


IPB003367 


Thrombospondin type 3 repeat 


IPB003367E 16.82 1.00e-40 35-82 
IPB003367F 16.21 1.00e-40 93-142 
IPB003367G 17.08 1.00e-40 143-184 
IPB003367H 15.25 1.00e-40 185-217 
IPB003367J 18.60 1.00e-40 247-288 
IPB003367L 21.71 1.00e-40 313-364 
IPB003367I 12.15 3.14e-37 218-246 
IPB003367K 16.35 9.10e-30 289-312 
IPB003367F 16.21 5.83e-21 53-102 
IPB003367C 20.73 1.54e-19 38-88
IPB003367D 18.41 9.44e-19 53-95

TPRfi0^^7r> 15? A\ 5 55*»-17 
LrD\J\JjjO/U xO.*rl U-J/ 

IPB003367F 16.21 2.74e-14 15-64
IPB003367C 20.73 9.27e-13 78-128 
IPB003367E 16.82 2.82e-12 12-59 
IPB003367E 16.82 4.98e-12 75-122
IPB003367C 20.73 5.96e-11 23-73
IPB003367C 20.73 2.38e-10 101-151 
IPB003367C 20.73 6.35e-10 61-111 
IPB003367E 16.82 8.88e-10 73-120 


2294 


IPB001978 


Troponin 


IPB001978A 18.18 8.89e-09 102-137 


2295


IPB000109


PTP n^ntirlf* h-oncnnrfprc fPTR9^ 


IPB000109D 25.09 6.67e-32 414-481
IPB000109B 29.23 4.18e-23 46-98 
IPB000109A 10.85 3.79e-15 23-41 
IPB000109C 8.21 7.00e-14 174-186 


2295 


PR01471


Histamine H3 receptor signature II


PR01471B 12.38 9.63e-09 3-21


2297 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 9.18e-21 113-138 
IPB000822 14.67 9.31e-18 29-54
IPB000822 14.67 9.31e-18 141-166
IPB000822 14.67 5.20e-16 57-82 
IPB000822 14.67 5.20e-16 85-110


2297


PR00048


C2H2-type zinc finger signature I


PR00048A 9.94 4.46e-14 138-151
IPB000822 14.67 1.50e-13 1-26
PR00048A 9.94 5.76e-12 110-123
PR00048A 9.94 1.00e-11 26-39


2297 


IPB001275


DM DNA binding domain


IPB001275 19.17 4.21e-11 17-56
PR00048A 9.94 4.79e-11 54-67
IPB001275 19.17 2.22e-10 73-112 
PR00048B 5.52 5.50e-10 126-135 
IPB001275 19.17 9.15e-10 45-84 
PR00048A 9.94 1.38e-09 82-95


2297 


IPB001222 


TFIIS zinc ribbon domain 


IPB001222 24.63 5.69e-09 1-37 
IPB001222 24.63 9.49e-09 29-65 


2299 


IPB003137 


Protease associated (PA) domain 


IPB003137 22.40 2.50e-19 188-218 


2303 


IPB000433 


ZZ Zinc finger 


IPB000433 14.10 8.20e-18 23-39 


2303 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 7.86e-10 82-107


2306


IPB001039 


"Major histocompatibility complex 
protein, Class I" 


IPB001039A 17.17 1.00e-40 22-75 
IPB001039B 27.55 1.00e-40 103-154 
IPB001039C 19.82 1.00e-40 184-237 
IPB001039D 16.49 1.00e-40 262-316 


2306 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 4.60e-29 268-305 
IPB003006A 17.51 6.14e-20 231-253 


2306 


IPB000353 


"Class II histocompatibility antigen, 
beta chain, beta-1 domain" 


IPB000353B 19.16 9.87e-14 210-259


2306 


IPB003363 


Glycoprotein GG/GX 


IPB003363E 13.35 2.94e-11 315-347
IPB000353C 20.11 4.68e-10 261-315 


2312 


IPB001359 


Synapsin 


IPB001359H 22.58 5.54e-09 98-148 


2312 


IPB003403 


Herpesvirus immediate early protein 


IPB003403A 21.25 6.18e-09 130-152


2313 


PR01382 


Claudin-9 signature IV 


PR01382D 12.38 1.11e-16 205-217


2313 


IPB000729 


PMP-22/EMP/MP20 family 


IPB000729D 18.96 2.96e-16 164-191 
IPB000729C 37.83 7.91e-16 84-136 
PR01382A 12.00 1.17e-15 41-51


2313 


PR01077 


Claudin signature III 


PR01077C 13.60 1.47e-14 67-77
PR01382C 5.67 5.14e-13 194-203
PR01382B 7.06 1.12e-12 95-104
PR01077B 14.12 1.00e-10 53-59 
PR01077D 11.20 4.00e-10 150-156 
PR01077A 9.72 8.16e-09 25-34 


2317 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245A 22.45 7.60e-28 129-169


2317 


IPB001772 


Kinase associated domain 1


IPB001772C 20.66 9.25e-24 124-154 


2317 


IPB000961 


Protein kinase C-terminal domain 


IPB000961C 15.48 2.13e-22 136-170 
IPB001772D 21.67 4.55e-17 196-235 


2317 


IPB000959 


POLO box duplicated region 


IPB000959B 15.68 8.60e-17 113-153 


2317 


IPB000095 


PAK-box /P21-Rho-binding 


IPB000095E 17.62 9.03e-17 137-182 


2317 


IPB003527 


MAP kinase 


IPB003527C 14.70 1.95e-16 121-169


2317 


IPB000861 


PKN/rhophilin/rhotekin rho-binding 
repeat 


IPB000861F 16.50 1.55e-15 130-184


2317 


IPB000494 


"Epidermal growth-factor receptor 
(EGFR), L domain" 


IPB000494C 24.40 7.35e-14 123-169 
IPB000959D 27.01 5.95e-13 236-288 
IPB000961D 21.23 7.19e-13 185-226 
IPB001245B 21.68 8.96e-13 189-227 
IPB001772E 24.88 8.96e-12 243-282 
IPB003527A 17.00 7.85e-11 28-53
IPB001772A 13.64 2.29e-10 19-50 
IPB003527G 17.26 1.30e-09 255-292 


2317 


PR00109 


Tyrosine kinase catalytic domain 
signature II 


PR00109B 11.07 4.23e-09 136-154 
IPB003527D 21.53 4.60e-09 182-223 


2318 


PR01254 


Prostaglandin D synthase signature I 


PR01254A 12.32 3.37e-29 51-74 
PR01254D 13.80 7.97e-27 129-152 
PR01254C 10.60 4.68e-22 94-112
PR01254F 10.08 7.58e-21 182-200 
PR01254E 14.07 1.00e-18 165-179 


2318 


PR00179 


Lipocalin signature II 


PR00179B 7.67 5.26e-13 140-152 
PR00179C 17.26 3.84e-12 168-183 
PR01254B 12.05 9.04e-12 77-87 


2318 


PR01275 


Neutrophil gelatinase lipocalin 
signature V 


PR01275E 6.38 1.72e-10 135-153 
PR00179A 13.97 3.25e-10 57-69 


2318 


PR01215 


Alpha-1-microglobulin signature IV


PR01215D 12.88 9.78e-10 131-150 


2318 


IPB000566 


Lipocalin and cytosolic fatty-acid 
binding protein 


IPB000566B 8.91 1.47e-09 140-150 


2318 


PR01174 


Retinol binding protein signature VI 


PR01174F 11.76 3.96e-09 139-155


2318 


PR01273 


Invertebrate colouration protein 
signature IV 


PR01273D 11.48 4.41e-09 140-154 
PR01275B 9.02 8.57e-09 59-69 



WO 2004/080148 



PCT/US2003/030720 



455 
TABLE 3B 



2320 


IPB001464 


Annexin family 


IPB001464D 25.42 1.00e-40 177-231 
IPB001464B 28.31 1.90e-36 47-99
IPB001464C 24.68 6.40e-30 110-149


2320 


PR00196 


Annexin family signature IV 


PR00196D 21.41 3.81e-22 115-141
PR00196C 9.01 9.67e-22 32-53
PR00196E 9.70 5.22e-21 195-215


2320 


PR00201 


Annexin type V signature VII 


PR00201G 12.46 1.63e-20 195-221 


2320 


PR00199 


Annexin type III signature VI 


PR00199F 15.67 5.10e-18 115-141 
IPB001464B 28.31 3.86e-17 131-183
PR00196C 9.01 5.70e-17 191-212 


2320 


PR00200 


Annexin type IV signature VII 


PR00200G 9.20 7.67e-17 195-221 
IPB001464D 25.42 8.71e-17 18-72 
PR00199D 4.74 9.87e-17 191-212 
PR00199G 9.85 4.45e-16 196-221 
PR00196B 11.03 9.31e-16 5-21 


2320


PR00197 


Annexin type I signature IV 


PR00197D 7.59 1.73e-15 32-53 
PR00199D 4.74 2.17e-15 32-53 
IPB001464A 31.17 3.83e-15 47-101


2320 


PR00198 


Annexin type II signature IV 


PR00198D 7.41 3.89e-15 32-53 
PR00197F 9.40 6.80e-15 195-215
PR00200E 8.88 9.02e-15 32-53 


2320 


PR00202 


Annexin type VI signature VII 


PR00202G 8.03 9.04e-15 195-221 
PR00197D 7.59 1.00e-14 191-212 
IPB001464A 31.17 1.85e-14 131-185 
PR00198D 7.41 2.38e-14 191-212 
PR00198G 7.70 3.44e-13 195-215
PR00201D 8.61 3.51e-13 32-53
PR00200F 14.58 3.53e-13 115-141


2321 


IPB000175 


Sodium:neurotransmitter symporter
family 


IPB000175C 15.09 1.00e-40 56-107 
IPB000175D 23.45 1.00e-40 122-174 
IPB000175F 25.63 4.50e-38 310-349 
IPB000175E 21.88 5.95e-35 215-254 


2321 


PR00176 


Sodium/chloride neurotransmitter 
symporter signature V 


PR00176E 11.14 2.00e-24 165-185 
PR00176G 13.12 3.77e-22 301-321 


2321 


PR01195 


GAT-1 GABA neurotransmitter
transporter signature II 


PR01195B 13.58 6.60e-22 38-55 
PR01195D 9.00 3.75e-21 426-443
PR00176F 11.11 1.36e-19 219-238
IPB000175G 16.18 5.13e-19 371-393 
PR00176D 8.96 6.48e-18 83-100 
PR00176H 15.94 7.63e-18 341-361 
PR01195C 15.62 1.14e-13 191-200 


2323 


IPB001863 


Glypican 


IPB001863A 13.95 5.03e-15 56-71 


2323 


PR00436 


Interleukin-8 signature I 


PR00436A 15.20 7.91e-10 1-24 


2328 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 4.15e-28 59-86 


2328 


IPB001134 


"Netrin, C-terminus" 


IPB001134C 17.82 4.13e-13 72-86 
IPB001599K 8.15 1.46e-10 29-40


2329 


IPB001599 


Alpha-2-macroglobulin family


IPB001599L 18.66 4.15e-28 59-86 


2329 


IPB001134 


"Netrin, C-terminus" 


IPB001134C 17.82 4.13e-13 72-86 
IPB001599K 8.15 1.46e-10 29-40


2330 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 4.15e-28 59-86 


2330 


IPB001134 


"Netrin, C-terminus" 


IPB001134C 17.82 4.13e-13 72-86 
IPB001599K 8.15 1.46e-10 29-40


2331 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 4.15e-28 59-86 


2331 


IPB001134


"Netrin, C-terminus" 


IPB001134C 17.82 4.13e-13 72-86 
IPB001599K 8.15 1.46e-10 29-40


2332 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 4.15e-28 59-86 


2332 


IPB001134


"Netrin, C-terminus" 


IPB001134C 17.82 4.13e-13 72-86
IPB001599K 8.15 1.46e-10 29-40


2334 


PR00010 


Type II EGF-like signature III


PR00010C 6.98 1.37e-11 7-17


2334 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 5.50e-10 2-17 


2334 


IPB000033 


"Low-density lipoprotein (ldl) 
receptor, YWTD repeat" 


IPB000033B 7.05 8.26e-10 7-17 


-j jj 


IPB000492


Protamine 2 (PRM2)


IPB000492B 5.26 7.16e-09 62-96


2336 


PR00014 


Fibronectin type III repeat signature 
IV 


PR00014D 15.12 5.74e-10 215-229 


2339 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494C 14.46 8.36e-35 39-82 
IPB002494C 14.46 6.55e-31 83-126
IPB002494C 14.46 9.46e-26 93-136 

IPB002494C 14.46 4.84e-25 49-92

IPB002494C 14.46 8.59e-24 44-87 
IPB002494C 14.46 9.38e-23 73-116
IPB002494C 14.46 2.73e-22 98-1 


2339 


IPB000359 


Cystine-knot domain 


IPB000359B 19.26 9.57e-13 43-61 

IPB002494A 12.44 1.56e-12 61-94 
IPB002494B 10.58 2.50e-12 70-84 
IPB002494B 10.58 2.50e-12 114-128 
IPB002494C 14.46 5.41e-12 53-96 


2339 


IPB001271 


Mammalian defensin 


IPB001271 19.97 7.95e-12 77-105 
IPB001271 19.97 9.59e-12 38-66 
IPB002494B 10.58 1.28e-11 45-59
IPB002494B 10.58 1.28e-11 89-103
IPB002494A 12.44 4.00e-11 75-108


2339 


IPB000006 


"Vertebrate metallothionein, family 
1" 


IPB000006 13.41 4.10e-11 85-130
IPB001271 19.97 5.13e-11 116-144
IPB000006 13.41 6.80e-11 59-104
IPB000359B 19.26 7.48e-11 122-140
IPB000006 13.41 8.00e-11 89-134
IPB002494A 12.44 8.18e-11 65-98
IPB002494C 14.46 1.61e-10 102- 


2339 


IPB000967 


Zinc finger NF-X1 type


IPB000967E 21.88 1.56e-09 70-110 


2339 


IPB001762 


Disintegrin 


IPB001762A 23.93 1.88e-09 58-98 
IPB001271 19.97 2.15e-09 117-145 
IPB002494A 12.44 2.55e-09 81-114 
IPB002494A 12.44 3.13e-09 60-93
IPB002494A 12.44 3.23e-09 47-80
IPB002494A 12.44 3.23e-09 91-124 
IPB002494A 12.44 3.23e-09 96-1 


2340 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494C 14.46 8.36e-35 39-82 
IPB002494C 14.46 6.55e-31 83-126 
IPB002494C 14.46 9.46e-26 93-136 

IPB002494C 14.46 8.59e-24 44-87 
IPB002494C 14.46 9.38e-23 73-116 
IPB002494C 14.46 2.73e-22 98-1 


2340 


IPB000359 


Cystine-knot domain 


IPB000359B 19.26 9.57e-13 43-61 
IPB000359B 19.26 9.57e-13 87-105 
IPB002494A 12.44 1.56e-12 61-94 
IPB002494B 10.58 2.50e-12 70-84 
IPB002494B 10.58 2.50e-12 114-128 
IPB002494C 14.46 5.41e-12 53-96 


2340 


IPB001271 


Mammalian defensin 


IPB001271 19.97 7.95e-12 77-105 
IPB001271 19.97 9.59e-12 38-66
IPB002494B 10.58 1.28e-11 45-59
IPB002494B 10.58 1.28e-11 89-103
IPB002494A 12.44 4.00e-11 75-108


2340 


IPB000006 


"Vertebrate metallothionein, family 
1" 


IPB000006 13.41 4.10e-11 85-130
IPB001271 19.97 5.13e-11 116-144
IPB000006 13.41 6.80e-11 59-104
IPB000359B 19.26 7.48e-11 122-140
IPB000006 13.41 8.00e-11 89-134
IPB002494A 12.44 8.18e-11 65-98
IPB002494C 14.46 1.61e-10 102- 


2340 


IPB000967 


Zinc finger NF-X1 type 


IPB000967E 21.88 1.56e-09 70-110 


2340 


IPB001762 


Disintegrin 


IPB001762A 23.93 1.88e-09 58-98 
IPB001271 19.97 2.15e-09 117-145
IPB002494A 12.44 2.55e-09 81-114 
IPB002494A 12.44 3.13e-09 60-93 
IPB002494A 12.44 3.23e-09 47-80 
IPB002494A 12.44 3.23e-09 91-124 
IPB002494A 12.44 3.23e-09 96-1 


2341 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494C 14.46 8.36e-35 39-82 
IPB002494C 14.46 6.55e-31 83-126 
IPB002494C 14.46 9.46e-26 93-136 
IPB002494C 14.46 4.84e-25 49-92 
IPB002494C 14.46 8.59e-24 44-87 
IPB002494C 14.46 9.38e-23 73-116 
IPB002494C 14.46 2.73e-22 98-1 


2341 


IPB000359 


Cystine-knot domain 


IPB000359B 19.26 9.57e-13 43-61 
IPB000359B 19.26 9.57e-13 87-105 
IPB002494A 12.44 1.56e-12 61-94 
IPB002494B 10.58 2.50e-12 70-84 
IPB002494B 10.58 2.50e-12 114-128 
IPB002494C 14.46 5.41e-12 53-96 


2341 


IPB001271 


Mammalian defensin 


IPB001271 19.97 7.95e-12 77-105
IPB001271 19.97 9.59e-12 38-66
IPB002494B 10.58 1.28e-11 45-59
IPB002494B 10.58 1.28e-11 89-103
IPB002494A 12.44 4.00e-11 75-108


2341 


IPB000006


"Vertebrate metallothionein, family
1"


IPB000006 13.41 4.10e-11 85-130
IPB001271 19.97 5.13e-11 116-144
IPB000006 13.41 6.80e-11 59-104
IPB000359B 19.26 7.48e-11 122-140
IPB000006 13.41 8.00e-11 89-134
IPB002494A 12.44 8.18e-11 65-98
IPB002494C 14.46 1.61e-10 102-


2341 


IPB000967 


Zinc finger NF-X1 type 


IPB000967E 21.88 1.56e-09 70-110 


2341 


IPB001762 


Disintegrin 


IPB001762A 23.93 1.88e-09 58-98
IPB001271 19.97 2.15e-09 117-145 
IPB002494A 12.44 2.55e-09 81-114 
IPB002494A 12.44 3.13e-09 60-93 
IPB002494A 12.44 3.23e-09 47-80 
IPB002494A 12.44 3.23e-09 91-124
IPB002494A 12.44 3.23e-09 96-1 


2342 


IPB000734 


Lipase 


IPB000734 10.25 8.12e-09 224-238


2343 


IPB000734 


Lipase 


IPB000734 10.25 8.12e-09 224-238 


2344 


PR01223 


Bride of sevenless protein signature 
VI 


PR01223F 4.19 9.78e-11 205-229


2344 


PR00354 


7Fe ferredoxin signature III 


PR00354C 6.24 8.06e-09 260-277 


2345 


IPB001304 


C-type lectin domain 


IPB001304A 17.98 8.04e-14 90-114
2345


PR00356 


Type II antifreeze protein signature
VII


PR00356G 10.21 8.15e-09 201-214 


2346 


IPB001304 


C-type lectin domain 


IPB001304A 17.98 8.04e-14 90-114 


2346 


PR00356


Type II antifreeze protein signature 
VII 


PR00356G 10.21 8.15e-09 201-214


2347 


PR00245 


Olfactory receptor signature V 


PR00245E 8.96 5.15e-16 341-352
PR00245E 8.96 5.15e-16 659-670
PR00245B 13.73 3.77e-15 187-199
PR00245C 14.65 2.73e-14 234-250
PR00245C 14.65 8.27e-14 552-568
PR00245D 9.34 2.59e-13 294-303
PR00245D 9.34 2.59e-13 612-621
PR00245B 13.73 1.39e-12 505-517


2347 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 7.00e-12 176-187
IPB000276A 11.56 7.00e-12 494-505
PR00245A 10.98 8.77e-12 468-479
PR00245A 10.98 1.72e-11 150-161
IPB000276D 9.40 6.09e-10 340-356 


2347 


PR00237 


Rhodopsin-like GPCR superfamily
signature II 


PR00237B 12.45 7.55e-10 435-456
IPB000276D 9.40 7.65e-10 658-674
PR00237A 9.81 1.84e-09 402-426


2347


PR00534 


Melanocortin receptor family 
signature I 


PR00534A 12.77 2.83e-09 109-121 
PR00534A 12.77 2.83e-09 427-439 
PR00237C 14.77 3.86e-09 162-184 
PR00237B 12.45 6.92e-09 117-138 
PR00237A 9.81 8.31e-09 84-108 


2348 


PR00346 


Tissue factor signature VIII 


PR00346H 10.74 8.18e-09 76-99 


2350 


PR00457 


Animal haem peroxidase signature 
VII 


PR00457G 14.17 4.48e-14 144-164 
PR00457H 14.82 5.85e-13 215-229 
PR00457F 14.42 6.32e-12 17-27 


2351 


PR00457 


Animal haem peroxidase signature 
VII 


PR00457G 14.17 4.48e-14 144-164 
PR00457H 14.82 5.85e-13 215-229
PR00457F 14.42 6.32e-12 17-27 


2354 


IPB000623 


Shikimate kinase 


IPB000623A 19.06 6.27e-09 55-84 


2360 


IPB001841 


RING finger 


IPB001841 10.69 1.95e-09 159-168


2372 


IPB000421 


Coagulation factor 5/8 type C 
domain (FA58C) 


IPB000421B 20.70 1.36e-14 129-149 


2373 


IPB000421 


Coagulation factor 5/8 type C 
domain (FA58C)


IPB000421B 20.70 1.36e-14 129-149 


2375 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245B 21.68 3.45e-17 60-98 


2375 


IPB003527


MAP kinase 


IPB003527D 21.53 4.48e-15 53-94 


2375 


IPB000959 


POLO box duplicated region 


IPB000959C 23.49 4.21e-12 35-87 


2375 


IPB000861 


PKN/rhophilin/rhotekin rho-binding
repeat 


IPB000861G 13.73 5.59e-12 62-111 


2375 


IPB000095 


PAK-box /P21-Rho-binding


IPB000095F 16.47 2.26e-11 64-118


2375 


IPB000961 


Protein kinase C-terminal domain 


IPB000961D 21.23 1.61e-10 56-97 


2376 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881A 8.72 2.20e-09 41-50


2376 


PR00873 


Echinoidea (sea urchin) 
metallothionein signature IV 


PR00873D 8.25 8.11e-09 41-59 


2377 


PR00402 


Tec/Btk domain signature I 


PR00402A 20.14 8.15e-15 94-113 
PR00402B 12.26 4.69e-13 113-125 
PR00402C 13.13 8.03e-12 125-138 


2379 


IPB003886 


Extracellular domain in nidogen 


IPB003886D 13.91 8.57e-15 46-65 


2379 


IPB000152 


Aspartic acid and asparagine 
hydroxylation site 


IPB000152 8.86 9.05e-14 1-16 
IPB000152 8.86 5.91e-13 46-61 


2379 


IPB001881 


Calcium-binding EGF-like domain 


IPB001881B 12.28 9.25e-13 1-12 


2379 


PR01217 


Proline rich extensin signature VII 


PR01217G 4.02 4.20e-11 125-150
2379


IPB000033


"Low-density lipoprotein (ldl)
receptor, YWTD repeat" 


IPB000033B 7.05 4.96e-11 51-61
IPB001881B 12.28 1.00e-10 46-57


2379 


PR00010 


Type II EGF-like signature III


PR00010C 6.98 1.66e-09 51-61


2379 


PR00049 


Wilm's tumour protein signature IV


PR00049D 0.00 3.29e-09 133-147 
IPB000033B 7.05 3.84e-09 6-16 




IPB000561


EGF-like domain 


IPB000561 4.89 6.79e-09 55-63
PR00010C 6.98 7.80e-09 6-16 




PR00910 


Luteovirus ORF6 protein signature I


PR00910A 2.74 8.71e-09 133-145
PR00910A 2.74 9.46e-09 131-143




PR00245


Olfactory receptor signature III


PR00245C 14.65 9.53e-17 218-234


2385 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 9.25e-14 160-171 
PR00245D 9.34 1.53e-13 278-287
PR00245E 8.96 6.81e-12 325-336
PR00245B 13.73 1.00e-10 171-183
IPB000276D 9.40 3.08e-09 324-340 


2385


PR00237 


Rhodopsin-like GPCR superfamily 
signature V 


PR00237E 13.03 3.83e-09 241-264 






Melanocortin receptor family
signature I


PR00534A 12.77 5.17e-09 93-105
PR00237C 14.77 5.91e-09 146-168 


2385


PR00896


Vasopressin receptor signature II


PR00896B 9.36 7.23e-09 97-108
PR00237G 19.23 1.00e-08 314-340 


2386


PR00245


Olfactory receptor signature III


PR00245C 14.65 9.53e-17 218-234


2386 


IPB000276 


Rhodopsin-like GPCR superfamily 


IPB000276A 11.56 9.25e-14 160-171
PR00245D 9.34 1.53e-13 278-287
PR00245E 8.96 6.81e-12 325-336 
PR00245B 13.73 1.00e-10 171-183 
IPB000276D 9.40 3.08e-09 324-340 


2386 


PR00237 


Rhodopsin-like GPCR superfamily
signature V 


PR00237E 13.03 3.83e-09 241-264 


2386 


PR00534 


Melanocortin receptor family
signature I 


PR00534A 12.77 5.17e-09 93-105 
PR00237C 14.77 5.91e-09 146-168 


2386 


PR00896 


Vasopressin receptor signature II


PR00896B 9.36 7.23e-09 97-108 
PR00237G 19.23 1.00e-08 314-340 


2389 


PR01360 


Interleukin-1 receptor antagonist 
precursor IL-1RA signature VI


PR01360F 14.44 3.11e-12 145-163 
PR01360C 10.33 4.84e-11 86-103


2389 


IPB000975 


Interleukin-1 


IPB000975D 24.45 5.55e-09 80-119
IPB000975E 28.12 9.80e-09 124-163 


2389 


PR00264 


Interleukin-1 precursor family 
signature I 


PR00264A 18.63 1.00e-08 83-103 


2390 


IPB001664 


Intermediate filament proteins 


IPB001664B 17.44 9.69e-22 102-141 
IPB001664C 11.32 4.38e-18 159-186 


2390 


PR01248


Type I keratin signature II


PR01248B 8.42 6.37e-15 94-117 
PR01248C 10.07 9.23e-14 148-168 
PR01248A 8.12 4.31e-11 73-86


2390 


PR01177


Metabotropic gamma-aminobutyric
acid type B1 receptor signature X


PR01177J 6.10 4.96e-10 11-29


2393 


PR01276 


Type II keratin signature III 


PR01276C 10.16 7.32e-11 67-80
PR01276B 9.79 5.96e-10 20-32 


2394 


IPB001818 


Matrixin 


IPB001818C 24.38 7.43e-35 54-99 
IPB001818B 26.48 8.15e-25 9-50
IPB001818C 24.38 1.55e-21 96-141 


2394 


PR00138 


Matrixin signature III 


PR00138C 20.07 1.78e-16 52-80 
PR00138B 14.84 5.21e-10 28-43 
PR00138C 20.07 9.18e-10 94-122 


2395 


IPB001818 


Matrixin 


IPB001818C 24.38 7.43e-35 54-99 
IPB001818B 26.48 8.15e-25 9-50 
IPB001818C 24.38 1.55e-21 96-141
2395


PR00138 


Matrixin signature III 


PR00138C 20.07 1.78e-16 52-80 
PR00138B 14.84 5.21e-10 28-43 
PR00138C 20.07 9.18e-10 94-122 


2396 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 2.07e-09 10-24 


2396 


IPB002000 


Lysosome-associated membrane 


IPB002000D 5.87 5.25e-09 12-25 


2405 


IPB000364


Phosphoenolpyruvate carboxykinase
(GTP)


IPB000364M 26.08 1.40e-09 6^3-657


2406 


IPB001304


C-type lectin domain 


IPB001304A 17.98 6.50e-17 155-179


2412 


IPB001559 


Phosphodiesterase family 


IPB001559F 24.25 1.49e-25 343-377 
IPB001559D 19.17 5.00e-20 207-233
IPB001559C 16.25 5.34e-16 172-193
IPB001559E 16.18 5.35e-16 245-263
IPB001559A 10.81 1.23e-11 49-60
IPB001559B 12.98 8.50e-10 153-163


2412 


IPB000890 


Acetate and butyrate kinase


IPB000890E 8.17 8.66e-09 336-349


2414 


PR00049 


Wilm's tumour protein signature IV 


PR00049D 0.00 9.24e-11 410-424
PR00049D 0.00 2.07e-10 412-426
PR00049D 0.00 2.14e-10 411-425
PR00049D 0.00 2.14e-10 414-428 


2414 


IPB000996 


Clathrin light chain 


IPB000996B 20.25 8.98e-10 342-394 
PR00049D 0.00 9.43e-10 408-422 
PR00049D 0.00 9.71e-10 409-423


2414 


PR01217 


Proline rich extensin signature II 


PR01217B 4.82 7.09e-09 412-428 






Tudor domain


IPB002999B 7.50 7.55e-09 412-420


2414 


PR01471 


Histamine H3 receptor signature V 


PR01471E 5.41 8.92e-09 411-426
PR00049D 0.00 8.93e-09 413-427 


2415 


PR01372 


Yersinia virulence determinant YopE 

protein signature II


PR01372B 7.73 4.87e-09 21-38 


2420 


IPB003817 


Phosphatidylserine decarboxylase 


IPB003817D 23.34 8.71e-25 194-220 
IPB003817C 10.66 4.00e-15 172-184 
IPB003817E 13.21 2.67e-14 283-299
IPB003817A 12.64 4.15e-13 77-91 
IPB003817B 13.04 4.00e-09 101-109 


2425 


IPB002469 


"Dipeptidyl peptidase IV, N- 
terminus" 


IPB002469J 8.97 3.52e-12 17-33 


2426 


IPB002469 


"Dipeptidyl peptidase IV, N- 
terminus" 


IPB002469J 8.97 3.52e-12 17-33 


2427 


IPB002469 


"Dipeptidyl peptidase IV, N- 
terminus" 


IPB002469J 8.97 3.52e-12 17-33 


2429




ZU5 domain


IPB000906A 22.49 6.14e-19 145-187
IPB000906F 35.93 3.09e-16 63-116
IPB000906F 35.93 7.91e-16 96-149 


2429 


PR01415 


Ankyrin repeat signature I 


PR01415A 12.73 3.70e-15 252-264 
IPB000906A 22.49 1.71e-14 46-88
IPB000906F 35.93 1.00e-12 346-399
IPB000906A 22.49 5.66e-12 112-154
IPB000906G 25.85 9.36e-12 53-101
PR01415A 12.73 1.00e-11 53-65
PR01415A 12.73 2.61e-11 119-13


2430 


PR00834 


HtrA/DegQ protease family signature 
III 


PR00834C 15.48 7.35e-19 148-172 
PR00834D 11.75 7.39e-17 186-203 
PR00834B 10.17 3.25e-13 107-127
PR00834E 13.43 6.03e-12 208-225 


2430 


IPB000126 


"Serine proteases, V8 family" 


IPB000126B 12.50 6.81e-12 191-207
PR00834A 8.79 1.44e-11 86-98
PR00834F 11.11 1.53e-09 301-313
IPB000126A 11.75 9.83e-09 78-93


2432 


PR00505 


D12 class N6 adenine-specific DNA 
methyltransferase signature I 


PR00505A 15.44 3.67e-12 39-55 
PR00505B 11.79 8.88e-12 60-74 


2433 


PR00179 


Lipocalin signature II 


PR00179B 7.67 2.35e-09 15-27
PR00179C 17.26 6.70e-09 42-57 


2433 


PR01174 


Retinol binding protein signature VI 


PR01174F 11.76 6.82e-09 14-30 


2433 


PR01254 


Prostaglandin D synthase signature V 


PR01254E 14.07 8.23e-09 39-53 


2434 


PR01042


Aspartyl-tRNA synthetase signature 
II 


PR01042B 12.76 4.69e-11 260-273
PR01042A 9.01 9.77e-10 244-256 


2434 


IPB002106 


Aminoacyl-transfer RNA synthetases 
class-II 


IPB002106A 13.35 1.00e-08 196-208 


2435 


IPB003952 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 


IPB003952D 19.72 4.50e-20 7-35 
IPB003952E 9.04 2.46e-16 48-65 


2436 


IPB001895 


Guanine-nucleotide dissociation 
stimulators CDC25 family 


IPB001895C 20.83 8.50e-23 52-87 


2437 


IPB000958 


KH domain 


IPB000958 6.84 5.09e-12 173-186 
IPB000958 6.84 2.29e-11 89-102


2440 


IPB001393 


Calsequestrin 


IPB001393A 16.72 1.00e-40 66-115 
IPB001393B 11.93 1.00e-40 169-222
IPB001393C 16.33 1.00e-40 225-277 
IPB001393D 11.26 1.00e-40 320-372 


2440


PR00312


Calsequestrin signature IX


PR00312I 15.97 5.71e-35 363-391
PR00312H 13.19 2.80e-34 294-321
PR00312J 13.61 6.48e-34 394-422
PR00312D 9.10 7.17e-33 159-188
PR00312B 14.57 4.41e-32 93-122
PR00312C 16.48 5.62e-32 123-152
PR00312G 11.43 1.49e-31 261-288
PR00312F 16.12 1.73e-31 230-259
PR00312A 11.96 7.94e-27 66-89


2442 


IPB000353 


"Class II histocompatibility antigen, 
beta chain, beta-1 domain"


IPB000353B 19.16 4.94e-16 139-188 


2442 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006A 17.51 8.50e-16 160-182 


2442 


IPB001003 


"MHC Class II, alpha chain, alpha-1 
domain" 


IPB001003B 14.72 9.90e-10 147-190 




PR00021


Small proline-rich protein signature I


PR00021A 3.31 1.35e-19 8-20
PR00021B 5.91 1.00e-14 31-40
PR00021B 5.91 1.00e-13 22-31
PR00021D 4.82 1.39e-13 25-33
PR00021D 4.82 1.39e-13 34-42
PR00021D 4.82 6.87e-13 43-51
PR00021B 5.91 1.92e-11 40-49
PR00021E 7.77 1.23e-10 61-70
PR00021C 5.97 1.25e-10 25-31
PR00021C 5.97 1.25e-10 34-40 


2444 


PR01217 


Proline rich extensin signature IV 


PR01217D 4.57 4.94e-10 30-51 
PR01217G 4.02 2.42e-09 23-48 
PR01217G 4.02 2.42e-09 30-55 
PR01217G 4.02 2.58e-09 21-46 
PR01217D 4.57 7.89e-09 21-42 
PR01217G 4.02 8.89e-09 39-64 


2444 


IPB000967 


Zinc finger NF-X1 type 


IPB000967E 21.88 9.44e-09 12-52 


2445 


PR00205 


Cadherin signature VI 


PR00205F 19.57 5.15e-21 522-548
PR00205B 20.09 5.50e-21 254-283 
PR00205D 12.22 1.39e-15 338-357
PR00205B 20.09 2.50e-15 464-493
PR00205D 12.22 6.09e-15 233-252 
PR00205G 13.05 8.00e-15 556-573 
PR00205A 17.38 2.59e-14 85-104 


2445 


IPB002126 


Cadherin domain 


IPB002126B 12.04 4.79e-14 242-259
PR00205B 20.09 5.63e-13 145-174
IPB002126B 12.04 7.43e-13 452-469
PR00205D 12.22 7.60e-13 443-462
PR00205G 13.05 7.75e-13 341-358
PR00205F 19.57 3.38e-12 309-335
PR00205G 13.05 9.10e-12 236-253
PR00205E 10.82 3.37e-11 252-265
PR00205E 10.82 7.16e-11 462-475
PR00205D 12.22 7.59e-11 553-572
PR00205F 19.57 9.05e-11 412-438
IPB002126A 14.68 4.91e-10 206-222
IPB002126A 14.68 5.30e-10 416-432
IPB002126B 12.04 3.25e-09 133-150
PR00205C 13.59 3.25e-09 326-338
IPB002126B 12.04 4.50e-09 347-364 
PR00205B 20.09 9.83e-09 581-610 
PR00205G 13.05 1.00e-08 446-463 


2447 


IPB000006 


"Vertebrate metallothionein, family 
1" 


IPB000006 13.41 3.90e-12 29-74 
IPB000006 13.41 4.41e-12 36-81 
IPB000006 13.41 6.70e-11 32-77


2447 


PR01228


Eggshell protein signature III 


PR01228C 5.69 1.22e-10 23-38 
PR01228C 5.69 1.98e-10 7-22 


2447 


IPB001271 


Mammalian defensin 


IPB001271 19.97 3.29e-10 48-76 


2447 


IPB002494 


"Keratin, high sulfur B2 protein" 


IPB002494C 14.46 3.36e-10 42-85 
IPB001271 19.97 3.47e-10 26-54
IPB002494A 12.44 6.11e-10 67-100


2447 


IPB002174 


Furin-like cysteine rich region 


IPB002174A 30.51 7.32e-10 8-39 
PR01228C 5.69 8.05e-10 16-31 


2447 


IPB003571 


Snake toxin 


IPB003571B 18.08 8.07e-10 73-96
IPB002494A 12.44 9.08e-10 22-55 


2447 


PR00858 


Crustacean metallothionein signature 
II 


PR00858B 5.93 1.48e-09 37-55 
IPB000006 13.41 3.11e-09 33-78 


2447 


IPB001169 


"Integrin beta, C-terminus" 


IPB001169K 27.45 3.19e-09 39-81 


2447 


IPB002919 


Trypsin Inhibitor-like cysteine rich 
domain 


IPB002919A 15.56 3.57e-09 49-61 
IPB002174A 30.51 4.15e-09 24-55 
IPB001271 19.97 4.44e-09 55-83 
IPB002494A 12.44 4.97e-09 29-62 
PR01228C 5.69 5.03e-09 15-30 
PR01228C 5.69 5.03e-09 19-34 
IPB002174A 30.51 5.28e-09 16-47 


2447 


IPB000254 


"Cellulose-binding domain, fungal 
type" 


IPB000254 18.11 5.36e-09 25-55 
IPB000006 13.41 5.59e-09 39-84 
IPB002174A 30.51 5.72e-09 33-64 
PR01228C 5.69 5.76e-09 24-39 


2447 


IPB000867 


Insulin-like growth factor-binding 
protein 


IPB000867B 11.44 6.55e-09 2-18 
IPB002174A 30.51 6.62e-09 4-35 


2447 


IPB002867 


Cysteine-rich domain (C6HC) 


IPB002867D 24.88 7.19e-09 35-66 
IPB000006 13.41 7.24e-09 47-92 


2447 


IPB000967 


Zinc finger NF-X1 type 


IPB000967D 10.42 7.37e-09 57-92 
IPB001169K 27.45 7.81e-09 32-74
IPB000006 13.41 8.07e-09 37-82
IPB002494A 12.44 8.35e-09 26-59
IPB000006 13.41 8.44e-09 52-97


2447 


PR01117 


CLC-6 chloride channel signature I 


PR01117A 7.79 9.47e-09 48-60
IPB001271 19.97 9.51e-09 64-92
IPB002174A 30.51 9.77e-09 36-67 


2447 


IPB002221 


WAP-type (Whey Acidic Protein) 
four-disulfide core domain 


IPB002221B 17.12 1.00e-08 45-66 


2448 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 4.79e-12 52-77 


2448 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 3.05e-10 49-62 
IPB000822 14.67 9.14e-10 200-225


2450 


PR00946 


Mercury scavenger protein signature 
I 


PR00946A 4.14 8.16e-09 6-24 


2452 


IPB002038 


Osteopontin 


IPB002038C 22.35 1.00e-40 173-214




PR00216


Osteopontin signature I


PR00216A 11 Q71e-^4 4V72
IPB002038A 12.23 5.15e-31 42-71
PR00216C 9.12 7.82e-21 95-120
PR00216B 6.70 9.49e-21 79-108
PR00216D 3.16 3.30e-18 142-156
PR00216E 6.95 3.81e-18 174-188
IPB002038B 15.58 4.11e-16 77-121 
PR00216D 3.16 3.69e-12 136-150 


2452 


IPB003403 


Herpesvirus immediate early protein 


IPB003403E 17.25 9.26e-09 117-144 
IPB002038B 15.58 9.58e-09 91-135 


2454 


IPB001241 


DNA topoisomerase II family 


IPB001241F 23.94 8.36e-37 475-523 


2454 


PR01158


Topoisomerase II signature VIII 


PR01158H 13.39 5.50e-30 804-826
IPB001241G 14.13 1.00e-29 547-573
PR01158K 14.14 5.24e-27 1023-1049
PR01158G 9.37 5.91e-27 757-780


2454 


IPB002205 


"DNA gyrase/topoisomerase IV, 
subunit A" 


IPB002205B 14.49 4.79e-24 760-795 
IPB001241E 20.94 3.00e-22 371-397 
PR01158I 13.95 7.00e-22 834-854
PR01158D 11.94 5.24e-21 565-580


2454 


PR00418 


DNA topoisomerase II family 
signature VI 


PR00418F 13.13 3.40e-20 546-562 
IPB001241A 15.98 6.04e-20 50-71 
IPB001241B 10.04 2.71e-19 172-190
PR00418G 12.91 8.94e-19 564-581
IPB001241H 17.27 1.96e-18 808-831


2454 


PR00615 


CCAAT-binding transcription factor 
subunit A signature I 


PR00615A 17.09 2.93e-18 319-337 
PR01158J 13.56 3.45e-18 939-953 
IPB002205D 10.13 3.54e-18 867-888 
PR00615B 18.03 3.77e-18 707-725 
PR00418C 9.38 1.82e-17 176-190
PR00418I 17.21 4.60e-17 626-642
IPB002205A 8.13 9.54e-17 729-747
PR00418A 13.58 7.65e-16 96-111
PR01158C 11.35 1.00e-15 519-532
PR01158E 8.11 2.29e-15 585-596 

PRM 1 ^RP 1 0 A 71 #»-1 ^ 6^9-644 
rJSAJixJor ixj.oy *t, i ic-i j ojl-o'h 

PR00615C 17.93 8.50e-15 1148-1166
PR00418E 14.82 1.37e-14 473-487
IPB001241D 14.87 1.43e-14 328-341
PR00418B 12.37 2.57e-14 133-146
PR00418D 14.25 2.71e-14 328-341
PR01158A 7.61 4.60e-13 456-466
IPB002205C 11.89 5.09e-12 812-826
PR00418H 10.58 5.91e-12 584-596
IPB001241C 13.37 1.31e-11 230-242


2454 


IPB000509 


Ribosomal protein L36E 


IPB000509B 20.29 7.85e-11 1216-1270
PR01158B 8.30 1.27e-10 471-478


2454 


IPB000135 


High mobility group proteins HMG1 
and HMG2


IPB000135D 2.13 5.64e-09 1362-1386 
IPB000135D 2.13 7.45e-09 1363-1387 
IPB000135D 2.13 8.09e-09 1364-1388 


2454 


PR01469 


Bacterial carbamate kinase signature 
V 


PR01469E 10.60 8.43e-09 128-146 
IPB000135D 2.13 8.73e-09 1360-1384


2457 


IPB001073 


Complement C1q protein


IPB001073A 22.14 6.55e-13 67-101 


2466 


IPB000959 


POLO box duplicated region 


IPB000959D 27.01 9.61e-10 204-256 


2473 


PR01475 


Parkin signature IX 


PR01475I 10.01 8.01e-09 96-118


2476 


IPB003743 


DUF164 


IPB003743B 20.16 4.64e-09 88-126 


2481




OvipiUS 


IPR00091 *>C 11 QO 5 OOp-OQ 41S-44Q 


2482


PR01377


V^laliUill I alglicllUlG i V 


PR01377D 6.30 1.00e-19 229-241
PR01377A 7.94 1.00e-16 141-152




IPB000729


PMP-22/EMP/MP20 family


TPRfl007?Qn 18 9f> 5 S0e-15 1Q7-994 


2482 


PR01077 


Claudin signature III 


PR01077C 13.60 2.53e-12 99-109 
PR01377B 11.79 1.19e-11 176-181
PR01377C 14.12 2.44e-11 188-195
PR01077B 14.12 1.00e-10 85-91
IPB000729C 37.83 5.31e-10 116-168 
PR01077A 9.72 4.49e-09 57-66 


2482 


PR01385 


Claudin-14 signature I 


PR01385A 5.13 5.70e-09 46-62 


2483 


IPB001919


"Cellulose-binding domain, bacterial
type"


IPB001919B 14.22 2.97e-09 188-212 


2487




Invasion protein B family signature
IV


PR01105D 7.82 6.19e-09 266-279


2488 


IPB002652 


Importin beta binding domain 


IPB002652H 25.98 1.00e-40 568-614 
IPB002652I 18.58 1.36e-35 647-683


2488 


IPB000225 


Armadillo repeat 


IPB000225E 20.58 8.20e-22 646-668 
IPB002652C 21.73 5.88e-14 519-571
IPB000225D 18.99 5.02e-13 535-558
IPB002652F 18.67 9.25e-11 543-575
IPB002652G 22.45 1.36e-09 535-580


2488 


IPB003191 


Guanylate-binding protein 


IPB003191M 10.38 7.64e-09 69-99


2490 


IPB001762 


Disintegrin 


IPB001762A 23.93 4.33e-23 19-59 


2490 


PR00289 


Disintegrin signature I 


PR00289A 14.29 1.16e-14 35-54
IPB001762B 10.06 3.40e-12 66-76 


2490 


IPB001774 


Delta serrate ligand 


IPB001774C 18.25 5.31e-10 238-280 
PR00289B 11.74 3.80e-09 64-76 


2490 


IPB003306 


WIF domain 


IPB003306E 25.51 7.40e-09 215-260


2491 


IPB001359


Synapsin 


IPB001359H 22.58 6.07e-09 96-146 


2495 


IPB001359 


Synapsin


IPB001359H 22.58 6.33e-09 35-85 
IPB001359H 22.58 7.73e-09 41-91


2496 


IPB001359 


Synapsin 


IPB001359H 22.58 6.33e-09 35-85 
IPB001359H 22.58 7.73e-09 41-91 


2497 


IPB001359 


Synapsin 


IPB001359H 22.58 6.33e-09 35-85 
IPB001359H 22.58 7.73e-09 41-91 


2498 


IPB000492 


Protamine 2 (PRM2) 


IPB000492B 5.26 7.95e-09 230-264 


2502 


PR01415


Ankyrin repeat signature I 


PR01415A 12.73 1.25e-09 187-199 


2502 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 9.31e-09 89-126 


2504 


IPB000492 


Protamine 2 (PRM2) 


IPB000492B 5.26 1.68e-09 219-253 


2504 


PR00580 


Prostanoid EP1 receptor signature V 


PR00580E 8.05 7.11e-09 226-247


2505 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 6.54e-09 195-232 


2506 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 6.54e-09 195-232 


2507 


PR00456 


Ribosomal protein P2 signature V 


PR00456E 3.08 9.42e-10 637-651
2508


PR01481


Neurotensin type 2 receptor signature
III


PR01481C 15.05 1.00e-17 176-189




PR01479


Neurotensin receptor signature II


PR01479B 12.40 2.43e-17 89-101
PR01481A 7.58 3.54e-16 M3
PR01479C 7.31 1.00e-15 102-115
PR01481B 6.68 1.45e-15 14-26
PR01481D 4.62 2.19e-15 190-201
PR01479E 8.74 3.70e-15 240-250
PR01479D 13.10 6.57e-14 229-239
PR01479A 8.89 1.00e-13 29-39


2508 


PR00237 


Rhodopsin-like GPCR superfamily 
signature VII


PR00237G 19.23 4.44e-12 249-275


2508 


PR00665 


Oxytocin receptor signature IV 


PR00665D 10.30 1.32e-11 134-150
PR01479F 8.03 5.19e-11 277-287
PR00237C 14.77 4.32e-10 115-137
PR00237A 9.81 7.33e-10 34-58 
PR00237D 9.76 7.43e-10 151-172 


2508 


PR01417 


Growth hormone secretagogue 
receptor type 1 signature IV 


PR01417D 12.33 8.13e-10 111-127 
PR00237F 14.34 6.05e-09 204-228 


2509 


IPB001101 


Plectin repeat 


IPB001101A 10.14 5.40e-14 1-37 


2510 


IPB001101 


Plectin repeat 


IPB001101A 10.14 5.40e-14 1-37 


2517


IPB001552


Acyl-CoA dehydrogenase


IPB001552E 22.77 2.46e-19 523-563
IPB001552D 24.88 5.35e-19 432-474 
IPB001552C 25.04 7.75e-15 378-418 
IPB001552B 18.05 3.43e-12 124-146 
IPB001552A 11.25 6.90e-10 97-108 


2518 


IPB001552 


Acyl-CoA dehydrogenase 


IPB001552E 22.77 2.46e-19 523-563 
IPB001552D 24.88 5.35e-19 432-474 
IPB001552C 25.04 7.75e-15 378-418 
IPB001552B 18.05 3.43e-12 124-146 
IPB001552A 11.25 6.90e-10 97-108 


2519 


IPB002524 


Cation efflux family 


IPB002524B 23.89 5.20e-17 50-89 


2519 


IPB003452 


Stem cell factor 


IPB003452B 19.11 6.63e-09 109-157 
IPB002524A 20.13 7.39e-09 8-48 


2520 


PR00215 


Neuromodulin signature III 


PR00215C 13.82 7.58e-10 478-498 


2520 


PR00194 


Tropomyosin signature IV 


PR00194D 9.54 7.19e-09 357-380 


2520 


IPB001422 


Neuromodulin (GAP-43) 


IPB001422A 13.23 7.43e-09 453-497 


2521 


PR01178 


Metabotropic gamma-aminobutyric 
acid type B2 receptor signature XI 


PR01178K 13.44 8.65e-09 179-203


2523 


IPB002889 


WSC domain 


IPB002889B 11.76 4.56e-10 34-80 
IPB002889B 11.76 7.84e-09 19-65 
IPB002889B 11.76 7.84e-09 27-73 
IPB002889B 11.76 1.00e-08 23-69


2529 


PR00019 


Leucine-rich repeat signature II 


PR00019B 11.42 1.33e-10 225-238 
PR00019A 11.72 8.33e-10 228-241 
PR00019A 11.72 4.00e-09 202-215 
PR00019B 11.42 7.82e-09 199-212 


2530 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 4.60e-10 297-334 


2530 


IPB001000 


Glycoside hydrolase family 10 


IPB001000H 10.38 7.80e-09 13-26 


2531 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 4.60e-10 297-334 


2531 


IPB001000 


Glycoside hydrolase family 10 


IPB001000H 10.38 7.80e-09 13-26 


2532 


IPB003884


Factor I membrane attack complex 


IPB003884A 12.20 7.06e-09 56-67 


2536 


IPB000822 


"Zinc finger, C2H2 type" 


IPB000822 14.67 7.50e-13 309-334 


2536 


PR00048 


C2H2-type zinc finger signature I 


PR00048A 9.94 4.18e-12 306-319 
IPB000822 14.67 5.74e-12 281-306
2536


PR00258 


Speract receptor signature I 


PR00258A 13.56 2.98e-10 87-103 


2536 


IPB002867 


Cysteine-rich domain (C6HC)


TPR009Rfi7P 1Q Afk Q 75*» in 70/C ill 


2540


IPB001522


Fatty acid desaturase, type 1


TPnooi ^77c 77 1 An io/i icq 
IPB001522E 20.55 5.85e-36 26-79 


2540


PR00075 


Fatty acid desaturase family 1
signature VII


pbooo750 m 50 fi tn* 9n in 1/k 

xIvvvv/JVJ Ivwl/ v.OiX-ZvJ 

prooo75f 1 1 fin 6 46p-i s 55 71 
PR00075F 14.62 8.81e-16 88-109 


2541 


IPB000432


"DNA mismatch repair protein MutS
family, C-terminal domain"


rPR0004^9D 18 8^ 8 Q9p-1Q ^^Q-A17 

IPB000432C 12.07 1.00e-37 329-360 
IPB000432F 16.97 3.86e-27 476-507
IPB000432E 8.78 9.00e-13 441-451 


2541 


IPB002156 


RNase H 


IPB002156B 11.33 2.20e-11 100-110


2542 


IPB003006 


Immunoglobulin and major
histocompatibility complex domain


IPB003006B 20.23 8.20e-10 33-70 


2543 


IPB000998 


MAM domain 


IPB000998C 18.63 1.95e-12 17-32 




PR00090 


MAM domain signature III


PP00090P 19 01 8 19p 10 1£ 77 
TPR0009Q8D 18 fifi Q filp-10 89-105 

li DUUU770L' IQ.wv 7.V/IC-1V/ OZ.-IV/J 


2544 


IPB002350


Kazal-type serine protease inhibitor
family


IPB002350 31 78 1 Q2e-1^ 46-86 


2544 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain


IPB003006B 20.23 1.78e-11 150-187


2545 


PR00449 


Transforming protein P21 ras
signature I 


PR00449A 12.48 8.16e-10 86-107


2545 


PR00326 


GTP1/OBG GTP-binding protein
family signature I 


PR00326A 8.70 9.13e-10 88-108


2545 


IPB000619 


Guanylate kinase


IPB000619A 18.08 4.21e-09 88-105


2545 


PR00364 


Disease resistance protein signature I 


PR00364A 8.29 7.14e-09 87-102 


2545 


PR00094 


Adenylate kinase signature I


PR00094A 9.62 9.57e-09 89-102


2545 


PR00918 


Calicivirus non-structural polyprotein

family signature I 


PR00918A 13.81 9.69e-09 82-102


2545 


IPB000795 


GTP-binding elongation factor


IPB000795A 10.67 9.77e-09 87-102


2547 


IPB003006 


Immunoglobulin and major 
histocompatibility complex domain 


IPB003006B 20.23 3.08e-09 4-41 


2548 


PR00698 


C.elegans Srg family integral 
membrane protein signature V 


PR00698E 14.65 2.76e-09 95-120 


2551 


IPB001737 


Ribosomal RNA adenine
dimethylase


IPB001737A 27.11 8.54e-10 135-180


2553 


IPB000906 


ZU5 domain 


IPB000906A 22.49 3.16e-09 38-80 


2554 


IPB001245 


Tyrosine kinase catalytic domain 


IPB001245B 21.68 6.54e-13 281-319 


2554 


IPB000095 


PAK-box /P21-Rho-binding 


IPB000095F 16.47 3.97e-11 285-339


2554 


IPB000961 


Protein kinase C-terminal domain


IPB000961D 21.23 2.22e-10 277-318
IPB001245A 22.45 3.18e-10 228-268 


2555 


IPB001245 


Tyrosine kinase catalytic domain


IPB001245B 21.68 6.54e-13 281-319


2555 


IPB000095 


PAK-box /P21-Rho-binding


IPB000095F 16.47 3.97e-11 285-339


2555 


IPB000961 


Protein kinase C-terminal domain 


IPB000961D 21.23 2.22e-10 277-318 
IPB001245A 22.45 3.18e-10 228-268 


2557 


PR01041 


Methionyl-tRNA synthetase 
signature V 


PR01041E 16.72 2.69e-17 60-75 
PR01041D 11.02 7.43e-13 30-41


2557 


IPB001412 


Aminoacyl-transfer RNA synthetases 
class-I 


IPB001412B 6.33 8.71e-12 98-108 


2558 


IPB000353 


"Class II histocompatibility antigen, 
beta chain, beta-1 domain" 


IPB000353A 18.51 7.30e-27 41-90 


2563 


IPB001599 


Alpha-2-macroglobulin family 


IPB001599L 18.66 4.15e-28 59-86 


2563 


IPB001134 


"Netrin, C-terminus" 


IPB001134C 17.82 4.13e-13 72-86
IPB001599K 8.15 1.46e-10 29-40


686


hormone 


Somatotropin hormone 
family 


3.1e-27 


103.9 


1 


9-182 


688 


hormone 


Somatotropin hormone 
family 


4.2e-37 


136.7 


1 


9-176 


689 


serpin 


Serpin (serine protease 
inhibitor) 


1.8e-74


260.8 


1 


51-397 


690 


efhand 


EF hand 


2.7e-08 


41.0 




34-62:70-98 


691 


Lipase_3


Lipase (class 3) 


2.3e-20 


8U 




366-505 


692 


PH 


PH domain 


0.028 


21.0 


1


36-127 


694 


GDA1_CD39 


GDA1/CD39 (nucleoside 
phosphatase) family 


4.2e-51 


183.2 


1


93-483 


695 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


3.3e-21 


83.9 


1


22-294 


696 


lectin_c


Lectin C-type domain 


5.1e-06 


33.3 


1


181-286 


698 


GDA1_CD39 


GDA1/CD39 (nucleoside 
phosphatase) family 


3.8e-42 


153.5 


1


40-402 


700 


myb_DNA- 
binding 


Myb-like DNA-binding
domain 


9.3e-09 


42.5 


1


231-278 


700 


ZZ 


Zinc finger, ZZ type 


0.021 


17.8 





168-211 


702 


zf-AN1


AN1-like Zinc finger


0.0034 


18.0 


-f 


10-52:103-138 


703 


CRAL_TRIO


CRAL/TRIO domain


2.5e-41 


150.7 




85-280 


703 


CRAL_TRIO_
N


CRAL/TRIO, N-terminus 


5.9e-10 


46.5 


1


3-71 


704 


Rhomboid 


Rhomboid family 


0.019 


-10.9 




152-307 


705 


GKAP 


Guanylate-kinase-associated 
protein (GKAP) p 


7e-292 


983.1 




621-979 


706 


LBP_BPI_CE
TP_C


LBP/BPI/CETP family, 
C-terminal do 


4.6e-06 


33.6 




218-456 


707 


Glyco_transf_8 


Glycosyl transferase family 
8 


0.0021 


-38.4 




103-368 


708 


LIM 


LIM domain 


7.8e-14 


59.4 




12-68 


710 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-169 


574.2 


20 


56-114:115-174:187- 

245:291-349:360- 

418:423-483:492- 

550:598-656:684-

743:750-808:809- 

868:869-928:929- 

988:1032-1090:1096- 

1154:1155- 

1214:1217- 

1277:1278- 

1337:1341-

1400:1417-1476 


710 


C4 


C-terminal tandem repeated 
domain in type 4 


1.5e- 
148 


506.9 


2 


1489-1596:1597- 
1711 


711 


ldl_recept_a 


Low-density lipoprotein 
receptor domain 


0 


1307.3 


32 


67-108:112-152:880- 

920:921-961:962- 

1001:1002- 

1041:1042- 

1081:1088- 

1127:1130- 

1170:1171- 

1212:2545- 

2586:2587- 

2625:2626- 

2664:2676- 
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2713:2717- 
2755:2756- 
2795:2796- 
2838:2840- 
2879:2880- 
2923:2926- 
2964:3352- 
3391:3392- 
3430:3431- 
3470:3471- 
3510:3511- 
3549:3550- 
3588:3589- 
3626:3629- 
3667:3668- 
3706:3709- 
3749:3750- 
3790:3797-3835 


711 


ldl_recept_b


Low-density lipoprotein 
receptor repeat 


2.4e- 
239 


808.6 


34 


332-373:375- 

417:419-461:605- 

646:648-692:694- 

742:744-791:1337- 

1382:1384- 

1425:1427- 

1472:1474- 

1517:1518- 

1558:1655- 

1696:1698- 

1740:1742- 

1780:1782- 

1825:1959- 

2000:2002-

2043:2045-

2087:2089- 

2131:2276- 

2315:2318- 

2365:2367- 

2410:2412- 

2453:2454- 

2495:3092- 

3134:3136- 

3177:3179- 

3221:3223- 

3260:3262- 

3303:3970- 

4016:4018- 

4074:4076- 

4118:4120-4163 


711 


EGF


EGF-like domain 


1.8e-28


108.0 


36 


69-106:157-190:196- 

230:512-553:835- 

870:1004-1039:1043- 

1079:1090- 

1125:1173- 

1210:1213- 

1249:1255- 

1289:1568-

1606:1875-
1911:2184- 
2219:2505- 
2540:2589- 
2623:2635- 
2662:2719- 
2753:2928- 
2962:2967- 
3003:3009- 
3041:3314- 
3350:3513- 
3547:3552- 
3586:3590- 
3624:3669- 
3704:3752- 
3788:3842- 
3879:3885- 
3917:4213- 
4244:4254- 
4285:4290- 
4321:4326- 
4357:4362- 
4393:4398- 
4428:4431-4463 


712 


ldl_recept_a 


Low-density lipoprotein 
receptor domain 


4.7e-21 


83.4 


2 


67-108:112-152 


714 


cadherin 


Cadherin domain 


0 


1168.1 


16 


47-126:140-241:255- 

344:363-466:480- 

573:588-680:694- 

784:798-884:898- 

987:1001-1091:1105- 

1201:1215- 

1306:1320- 

1411:1425- 

1520:1526- 

1622:1634-1728 


715 


cadherin 


Cadherin domain 


0 


1177.0 


16 


47-126:140-241:255- 

344:363-466:480- 

573:588-680:694- 

784:798-884:898- 

987:1001-1091:1105- 

1201:1215- 

1306:1320- 

1411:1425- 

1520:1526- 

1622:1634-1729 


716 


DPPIV_N_ter
m


Dipeptidyl peptidase IV 
(DPP IV) N-termi 


1.2e-07 


-81.3 


1 


132-652 


716 


Peptidase_S9 


Prolyl oligopeptidase family 


1.7e-06 


35.0 


1 


656-736 


717 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-71 


249.9 


10 


32-54:60-82:154- 
176:182-204:210- 
232:238-260:266- 
288:294-316:322-
344:350-372 


720 


ig 


Immunoglobulin domain 


2.8e- 
178 


605.6 


15 


68-128:163-223:259- 
317:352-410:445-
503:538-596:629-

687:720-780:813- 

871:904-962:995- 

1052:1085- 

1143:1176- 

1232:1266- 

1323:1356-1413 


720 


tsp_1


Thrombospondin type 1 
domain


5e-87 


302.5 


6 


1435-1485:1492- 

1542:1549- 

1599:1606- 

1656:1663- 

1713:1720-1770 


720 


EGF 


EGF-like domain 


1.6e-32 


121.5 


8 


2013-2047:2053- 

2092:2098- 

2130:2136- 

2172:2178- 

2215:2221- 

2256:2338- 

O^T>. 01*70 1A"\Q 

15 11.15 /o-z41o 


721 


SPRY 


SPRY domain 


2.7e-29 


110.7 


1 


289-418 


721 


SAP 


SAP domain 


6.9e-09 


43.0 


1 


3-37 


722 


ABC_tran


ABC transporter 


1e-105


364.6 


2 


510-692:1322-1506


724 


Acyl-CoA_dh


Acyl-CoA dehydrogenase, 
C-terminal domain 


1.6e-49 


178.0 


1 


50-201 


725 


EGF 


EGF-like domain 


1.9e-18 


74.7 


5 


65-91:98-132:138-
172:178-217:223-258 


725 


MAM 


MAM domain 


1.7e-13 


58.3 


1 


402-546 


726 


NHL 


NHL repeat 


5.4e-67 


236.0 


6 


431-458:478- 

505:525-552:572-

599:619-646:666-693 


726 


Filamin 


Filamin/ABP280 repeat 


6.9e-18 


72.9 


1 


306-402 


726 


zf-B_box


B-box zinc finger 


5.6e-05 


30.0 


1 


98-139 


727 


RhoGAP 


RhoGAP domain 


2.3e-50 


180.8 


1 


775-947 


727 


DAG_PE-bind


Phorbol 

esters/diacylglycerol 
binding dom 


0.0004 


21.8 


1 


703-747 


728 


CN_hydrolase


Carbon-nitrogen hydrolase 


0.0048 


-84.5 


1 


25-261


729 


tsp_1


Thrombospondin type 1 
domain 


6.9e-32 


119.4 


11 


570-623:980- 

1034:1037- 

1089:1092- 

1146:1165- 

1220:1221- 

1276:1313- 

1364:1367- 

1420:1426- 

1479:1482- 

1535:1543-1593 


729 


Reprolysin 


Reprolysin (M12B) family
zinc metallo 


1 la. 1 H 

l.3e-io 


Oo.O 


1 


Z/*fr-*frov 


729 


Pep_M12B_pr 
opep 


Reprolysin family 
propeptide 


4.8e-10 


46.8 


1 


93-223 


731 


ig 


Immunoglobulin domain 


5.1e-12 


53.4 


3 


6-99:146-235:282- 
373


732 


ig 


Immunoglobulin domain 


1.6e-16


68.3 


4 


42-129:179-272:319- 
408:455-546 


735 


RhoGEF 


RhoGEF domain 


3e-10 


47.5 


1 


165-340


737


rrm 


RNA recognition motif. 


1.1e-26


102.1 


3 


78-142:151-222:240- 
311 


742 


cadherin 


Cadherin domain 


3.6e- 
100 


346.2 


6 


147-243:257- 

349:369-460:474- 

563:577-666:685-773 


743 


PGM_PMM_II


Phosphoglucomutase/phosp 
homannomutase, alp 


0.08 


-11.7 


1 


67-179 


745 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e- 
108 


373.3 


15 


130-152:158- 

180:186-208:214- 

236:242-264:270- 

292:298-320:326- 

348:354-376:382- 

404:410-432:438- 

460:488-510:516- 

538:544-566 


746 


zf-C2H2 


Zinc finger, C2H2 type 


9.2e-91 


314.9 


12 


205-227:233- 

255:261-283:289- 

311:317-339:345- 

367:373-395:401- 

423:429-451:457- 

479:485-507:513-535 


746 


KRAB 


KRAB box 


2.3e-23 


91.1


1 


35-75 


747 


EMP24_GP25
L


emp24/gp25L/p24 family 


1.2e-79 


278.0 


1 


5-201 


748 


acid_phosphat


Histidine acid phosphatase 


2.5e- 
158 


539.4 


1 


31-371 


749 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


8.7e-60 


212.0 


1 


527-649 


749 


PH 


PH domain 


8e-17 


69.3 


1 


393-487 


749 


ank 


Ankyrin repeat 


4.6e-15 


63.5 


3 


826-858:859- 
891:892-925 


751 


zf-C2H2 


Zinc finger, C2H2 type 


3.3e-43 


157.0 


6 


603-625:631- 

653:693-715:721- 

743:751-773:779-801 


751 


KRAB 


KRAB box 


9.5e-20 


79.0 


1 


342-382 


753 


LRR 


Leucine Rich Repeat 


2e-30 


114.5 


8 


61-82:83-106:107- 
131:132-155:156- 
179:180-203:204- 
227:228-251 


753 


LRRCT 


Leucine rich repeat C- 
terminal domain 


3.6e-07 


37.2 


1 


261-311 


754 


A2M 


Alpha-2-macroglobulin 
family 


3.4e- 
195 


661.8 


1 


721-1469 


754 


A2M_N 


Alpha-2-macroglobulin 
family N-terminal regi 


1.6e-88 


307.5 


1 


1-623 


755 


fibrinogen_C 


Fibrinogen beta and gamma 
chains, C-term 


4.6e-24 


93.4 


1 


242-422 


756


fn3


Fibronectin type III domain


1.1e-53


190.9


4 


598-687:700- 
790:802-891:903-986 


756 


ig 


Immunoglobulin domain 


1.6e-49 


177.9 


6 


43-102:137-198:242- 

299:332-388:424- 

481:514-579 


758 


LRR 


Leucine Rich Repeat 


1.2e-28 


108.6 


7 


52-75:76-99:100- 
123:124-147:148- 
171:172-195:196-216 


758 


ig 


Immunoglobulin domain 


5.2e-07 


36.7 


1 


301-359


758


LRRCT 


Leucine rich repeat C- 
terminal domain


0.00013 


28.8 




240-285 


759 


7tm_2 


7 transmembrane receptor 
(Secretin family)


2.3e-20 


81.1 


1


1009-1273 


759 


GPS 


Latrophilin/CL-1-like GPS
domain 


7.1e-13


56.2 




950-1002 


759 


ig 


Immunoglobulin domain 


3.3e-08 


40.7 




286-352:485-547 


759 


SEA 


SEA domain 


0.043 


20.1 


1


168-279 


760 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-85 


297.6


12 


107-129:135- 

157:163-185:191- 

213:219-241:247- 

269:275-297:303- 

325:331-353:359- 

381:387-409:415-437 


764 


HIT 


HIT family 


0.00082 


-4.2 


1 


173-273 


768 


SRCR 


Scavenger receptor 
cysteine-rich domain 


2e-49


177.6 


2 


32-129:142-247 


768 


Lysyl_oxidase


Lysyl oxidase 


4.5e-41 


149.9 


1 


251-359 


769 


Glyco_transf_8


Glycosyl transferase family 
8 


4.7e-06 


-2.1 


1 


1-250 


770 


WD40 


WD domain, G-beta repeat 


4.4e-07 


37.0 


3 


215-251:365-
401:407-443


773 


Cytidylyltrans 


Phosphatidate 
cytidylyltransferase 


8e-92 


318.5 


1 


221-401


774 


WD40 


WD domain, G-beta repeat 


1.5e-08 


41.8


2 


166-203:327-363 


779 


HesB-like


HesB-like domain


3.5e-36 


133.6 


1 


49-151 


780 


ig 


Immunoglobulin domain 


0.014 


22.0 


2 


8-57:96-155 


783 


vwa 


von Willebrand factor type 
A domain 


2.1e-42


154.3 


1 


266-440 


783 


Kunitz_BPTI 


Kunitz/Bovine pancreatic 
trypsin inhibito 


1.7e-18 


74.8 


1 


540-590 


783 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.014 


-13.0 


4 


2-60:61-117:118- 
175:181-239 


784 


Sterol_desat


Sterol desaturase 


6.4e-46 


166.0 


1 


57-263 


785 


ig 


Immunoglobulin domain 


2e-32 


121.1 


4 


116-176:331- 

391:1355-1415:1552- 

1613 


786 


adenylatekinas
e


Adenylate kinase 


2.6e-08 


-30.8 


1 


35-189 


788 


SH3 


SH3 domain 


6.7e-13


56.3 


1 


1-56 


789 


SH3 


SH3 domain 


1.6e-14 


61.6 


1 


73-129 


790 


TIMP 


Tissue inhibitor of 
metalloproteinase 


1.1e-40


148.5 


1 


15-124 


791 


lectin_c


Lectin C-type domain 


5.1e-06 


33.3 


1 


162-267 


792 


UDPGT 


UDP-glucoronosyl and 
UDP-glucosyl transferas 


5e-237 


800.8 


1 


1-447 


794 


Ubie_methyltr 
an 


ubiE/COQ5 

methyltransferase family 


6.3e-05 


-96.3 


1 


37-241


794 


PCMT 


Protein-L-isoaspartate(D- 
aspartate) O 


0.038 


-104.6 


1 


23-192 


795 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


6.9e-31 


116.0 


1 


444-720 


799 


PH 


PH domain 


2.8e-18 


74.1 


1 


14-112 


804 


ig 


Immunoglobulin domain 


0.0006 


26.5 


2 


35-111:146-197 


809 


ig 


Immunoglobulin domain 


0.0014 


25.4 


1 


109-171


811


MHC_I


Class I Histocompatibility 
antigen, domains 


1.1e-06


4.5 


1 


29-205 


812 


ig 


Immunoglobulin domain 


5.4e-41 


149.6 


5 


78-137:176-237:274- 
335:369-430:465-529 


813 


ig 


Immunoglobulin domain 


2.2e- 
103 


356.8 


12 


295-358:393- 

452:1468-1530:1565- 

1627:1662- 

1724:1761- 

1823:1858- 

1926:1961- 

2020:2059- 

2120:2157- 

2218:2252- 

2313:2348-2412 


814 


ig 


Immunoglobulin domain 


2.2e- 
103 


356.8 


12 


490-553:588- 

647:1663-1725:1760- 

1822:1857- 

1919:1956- 

2018:2053- 

2121:2156- 

2215:2254- 

2315:2352- 

2413:2447- 

2508:2543-2607 


814 


LRR 


Leucine Rich Repeat 


1.1e-25


98.8 


6 


58-81:82-105:106- 

129:130-153:154- 

177:186-209 


814 


LRRCT 


Leucine rich repeat C- 
terminal domain 


7.1e-09 


42.9 


1 


219-280 


814 


LRRNT 


Leucine rich repeat N- 
terminal domain 


0.00025 


27.8 


1 


28-56 


816 


Apolipoprotein 


Apolipoprotein A1/A4/E 
family 


1.6e-06 


34.6 


1 


4-251 


817 


Apolipoprotein 


Apolipoprotein A1/A4/E 
family 


1.6e-06 


34.6 


1 


4-251 


819 


phoslip 


Phospholipase A2 


3.3e-48 


173.6 


1 


21-145 


821 


MRMLE 


Mandelate racemase / 
muconate lactonizing en 


4.6e-05 


-4.2 




149-386 


821 


MRMLE_N 


Mandelate racemase / 
muconate lactonizing en 


0.0031 


-0.4 




1-112 


822 


NAP 


Nucleosome assembly 
protein (NAP) 


1.7e- 
190 


646.3 


1 


12-285 


823 


PP2C 


Protein phosphatase 2C 


6.2e-72 


252.4 




107-383 


824 


vwc 


von Willebrand factor type 
C domain 


3.8e-13 


57.1 




103-157:160-214 


825 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


0.00045 


-23.4 




1-173 


826 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


2.2e-40






40-987 


828 


RhoGAP 


RhoGAP domain 


1.9e-26


101.3 




101-250 


829 


CUB 


CUB domain 


1.1e-27


105.4 




2-102 


830 


CUB 


CUB domain 


1.1e-27


105.4 




2-102 


831 


myosin_head 


Myosin head (motor 
domain) 


9.7e-15 


-285.0 




37-318 


832 


myosin_head


Myosin head (motor 
domain) 


4.9e-23 


-122.5 


1 


37-408


834


thyroglobulin_ 
1 


Thyroglobulin type-1 repeat


1.1e-20


82.1 


1 


316-379 


834 


kazal 


Kazal-type serine protease 
inhibitor 


1.5e-06


35.2 


1 


139-183 


838 


LRR 


Leucine Rich Repeat 


9.7e-45 


162.1 


12 


61-84:85-108:109- 

132:133-156:157- 

180:181-204:205- 

228:229-252:253- 

276:277-300:301- 

324:326-349 


838 


LRRCT 


Leucine rich repeat C- 
terminal domain 


7.5e-09 


42.8 


1 


359-405 


838 


LRRNT 


Leucine rich repeat N- 
terminal domain 


0.031


20.9 


1 


31-59 


841 


ank 


Ankyrin repeat 


8e-33 


122.5 


4 


1-27:29-61:130- 
162:164-196 


841 


SAM 


SAM domain (Sterile alpha 
motif) 


0.0031 


24.2 


1 


577-640 


844 


ig 


Immunoglobulin domain 


6.3e-39 


142.8 


4 


53-110:150-216:255-
310:350-417 


845 


ig 


Immunoglobulin domain 


5e-56 


199.5 


6 


53-110:150-216:255-

31U:JjU-41 /.4jO- 

516:553-617 


845 


MAM 


MAM domain 


1.3e-52


1 co o 
loo. 2 


i 
1 




847 


PLA2_B 


Lysophospholipase catalytic 
domain 


4.6e-50 


179.8 


1 


1108-1551 


847 


C2 


C2 domain 


l.oe-Uo 


33.1 


i 
1 


/y /-oou 


848 


PLA2_B


Lysophospholipase catalytic 
domain 


8.3e-53 


1 oo o 


l 


J J /-oUU 


848 


C2 


C2 domain 


1.6e-06 


35.1 


1 


46-129 


851 


ig 


Immunoglobulin domain 


3.6e-31 


117.0 


3 


48-105:169-227:265- 

1AA 

j44 


852 


ig 


Immunoglobulin domain 


3.6e-31 


117.0 


3 


44-101:165-223:261- 

1A A 


853 


ig 


Immunoglobulin domain 


2.8e-07 


37.6 


1 


44-101 


854 


C2 


C2 domain 


1.3e-70 


248.0 


2


ico lyic.oQQ inn 


855 


tsp_1


Thrombospondin type 1 
domain 


1.7e-26 


101.5 


6 


546-596:827- 
ooi:94j-yyj. i j 14- 
1364:1426- 

1/1*71 .1AHA 1 <1H 
14/ 1:14/4-1jjU 


855 


Reprolysin 


Reprolysin (M12B) family 
zinc metallo 


1.3e-15 


65.3 


1 


246-456 


855 


Pep_M12B_pr
opep 


Reprolysin family 
propeptide 


9.2e-05 


8.5 


1 


105-222 


857 


abhydrolase_2 


Phospholipase/Carboxyleste 
rase 


0.051 


-67.3 


1


120-326 


858


abhydrolase_2


Phospholipase/Carboxyleste 
rase 


0.051


-67.3


1




859 


SRCR 


Scavenger receptor 
cysteine-rich domain 


3e-20 


80.7 


1


336-433 


859 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.1e-12


54.7 


1


255-314 


860 


SRCR 


Scavenger receptor 
cysteine-rich domain 


2e-33 


124.5 


1


396-493 


860 


Collagen 


Collagen triple helix repeat
(20 copies)


9.1e-13


55.8


1


315-374


862 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-89 


310.8 


12 


192-214:220- 

242:248-270:276- 

298:304-326:332- 

354:360-382:388- 

410:416-438:444- 

466:472-494:500-523 


864 


zf-CCCH 


Zinc finger C-x8-C-x5-C- 
x3-H type 


1e-06


35.8 


1


52-78 


865 


WD40 


WD domain, G-beta repeat 


5.7e-12 


53.2 


3 


203-238:271- 
307:360-393 


867 


aminotran_3


Aminotransferase class-III


3.3e-98 


339.7 


1 


76-509


868 


aminotran_3


Aminotransferase class-III


6.8e-48 


172.5 


1 


2-406 


869 


trypsin 


Trypsin 


7e-63 


222.3 


1 


63-289 


870 


Glycos_transf_
1


Glycosyl transferases group 
1 


1.8e-06 


33.8 


1 


86-239 


873 


EGF 


EGF-like domain 


1.2e- 
120 


414.3 


16 


7-43:50-81:88- 

119:126-157:168- 

199:203-234:243- 

279:280-311:319- 

350:358-389:396- 

427:492-523:530- 

561:568-599:606- 

637:1046-1077 


873 


fn3


Fibronectin type III domain


4.1e-34 


126.7 


3 


641-722:740- 
823:839-921 


873 


sushi 


Sushi domain (SCR repeat) 


3.8e-05 


30.5


1 


433-486 


875 


AdoHcyase 


S-adenosyl-L-homocysteine 
hydrolase 


1.5e- 
280 


945.4


1 


81-507 


878 


fibrinogen_C


Fibrinogen beta and gamma 
chains, C-term 


7.4e-54 


192.3 


1 


146-382 


879 


fibrinogen_C 


Fibrinogen beta and gamma 
chains, C-term 


7.4e-54 


192.3 


1 


146-382 


880 


fibrinogen_C 


Fibrinogen beta and gamma 
chains, C-term 


7.4e-54 


192.3 




146-382 


883 


aa_permeases 


Amino acid permease 


3.9e-07 


-148.3 


1 


40-475 


883 


Aa_trans


Transmembrane amino acid 
transporter pro 


0.0067 


-123.4 


1 


42-460 


884 


pkinase 


Protein kinase domain 


9.3e-06 


-52.2 


1


100-659 


885 


lectin c 


Lectin C-type domain 


0.0011 


6.9 


1 


47-128 


888 


Peptidase_M20 


Peptidase family 
M20/M25/M40 


0.00043 


16.2 




55-357


889 


sugar_tr 


Sugar (and other) 
transporter 


0.017 


-118.8 


1 


1-335 


891 




Immunoglobulin domain 


7.5e-05 


29.6 


1 


55-127 


892 


bromodomain 


Bromodomain 


6.9e-87 


302.1 




63-152:356-445 


893 


OLF 


Olfactomedin-like domain 


1.2e-
120 


414.2




ZZU-*t/U 


894 


ig 


Immunoglobulin domain 


7.1e-16 


66.2 




262-322:354-414 


894 


kazal 


Kazal-type serine protease 
inhibitor domain 


le-09 


45.7 




88-132 


894 


efhand 


EF hand 


0.0013 


25.4 




178-206 


895 


aminotran_1_2


Aminotransferase class I 
and II 


8.5e-11


49.3 




81-416 


896 


LIM 


LIM domain 


5.4e-42 


152.9 


4 


24-80:83-140:153-
209:212-271


896 


VHP


Villin headpiece domain


7 U OA 

/.ie-zu 




i 
i 


500 ^oq 


897 


pkinase 


Protein kinase domain 


8.1e-
102


351.7 


i 


356-613 


898 


pkinase 


Protein kinase domain 


8.1e- 
102 


351.7 


i 


543-800 


898 


DCX 


Doublecortin 


j./e-iv/ 


40.O 


i 
l 


1 OA 1 OA 


899 


GST_C


Glutathione S-transferase, 
C-terminal domain 


0.088 


11.8 


i 


254-370 


900 


C1q


C1q domain


7.6e-72 


252.1 


i 


116-241


900 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8.4e-06 


32.7 


1 


OO AO 


902 


BRCT 


BRCA1 C Terminus
(BRCT) domain 


/!« AO 

4e-9z 


Tin r 

3 19. J 


0 


1A QO-QA 1 QO./17Q 

570:579-666:737- 


903 


BRCT 


BRCA1 C Terminus
(BRCT) domain 


z. /e-uo 




i 
1 




905 


LRRCT 


Leucine rich repeat C- 
terminal domain 


7 <e» AO 


49 R 
*KZ.o 


1 
1 




905 


LRR


Leucine Rich Repeat 


u.uuoo 


9** 1 


1 
I 


4-97 


906 


ig 


Immunoglobulin domain 


0.002 


24.8 


1 


25-79 


907 


TB2_DP1_HV
A22


TB2/DP1, HVA22 family 


1e-34


IOC 9 
IZo. / 


i 
1 


9 


908 


An_peroxidase 


Animal haem peroxidase 


3e-193


655.4


1 


77A 1 1AQ 

/ /u-i Duy 


908 




Immunoglobulin domain 


1e-34


IOC 0 

lZo.o 


A 


994 98^*390,- 
ZZf -Zo J . JZU- 

376-409-472-533-590 


908 


LRR 


Leucine Rich Repeat 


4.7e-22 


86.7 


4 


51-74:75-98:99- 

122:123-146


908 


LRRCT


Leucine rich repeat C- 
terminal domain 


8.4e-11


49.3 


1 


156-208 


908 


vwc 


von Willebrand factor type 
C domain 


7e-08 


39.6 


1 


1439-1494 


908 


TILa 


TILa domain 


0.023 


12.0 


1 


1438-1491 


909 


An_peroxidase 


Animal haem peroxidase 


la. 1 AO 

Je-lvi 




1 
1 




909 


ig 


Immunoglobulin domain 


1 O/f 

ie-J4 


1 9Q C 
lZo.o 




9^-314 , 3 < >1 - i 
407 -440-503 '564-62 1 


909 


LRR 


Leucine Rich Repeat 


4.7e-22 


86.7 


4 


82-105:106-129:130- 
153:154-177


909 


LRRCT 


Leucine rich repeat C- 
terminal domain 


8.4e-11


49.3 


1 


187-239 


909 


vwc 


von Willebrand factor type 
C domain 


7e-08 


39.6 


1 


1470-1525 


909 


TILa 


TILa domain 


n nil 


1Z.U 


1 
1 


1460- 1 *>99 


910 


An_peroxidase 


Animal haem peroxidase 


3e-193 


655.4 


1 


663-1202 


910 


ig 


Immunoglobulin domain 


3.2e-24 


93.9 


3 


201-260:297- 
353:386-449 


910


LRR


Leucine Rich Repeat


9 R 
Z.Oc- lo 


74 T 


4


51-74:75-98:99-
122:123-146 


910 


vwc


von Willebrand factor type 
C domain 


7e-08 


39.6 


1 


1332-1387


910 


TILa 


TILa domain 


0.023 


12.0 


1 


1331-1384 


911 


EGF 


EGF-like domain 


3.1e-50 


180.3 


9 


47-99:106-141:172- 
203:210-245:574- 
605:823-854:861- 
892:901-933:940-971


911


laminin_G


Laminin G domain 


0.0002 


25.1 


2 


275-401:663-788 


914 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


1.5e-65 


231.2 


2 


152-240:270-364 


914 


RIIa


Regulatory subunit of type 
II PKA R-subu 


4.8e-13 


56.8 


1


25-62 


915 


DIL 


DIL domain 


6.6e-40 


146.0 


1 


214-323 


915 


PDZ 


PDZ domain (Also known 
as DHR or GLGF) 


2e-12 


54.7 


1 


rrc CIO 1 


916 


lipoxygenase 


Lipoxygenase 


3.3e- 
193 


655.3 


1 


1 O 1 CA O 

121-o4o 


916 


PLAT 


PLAT/LH2 domain 


1.6e-29


111.5 


1 


2-111 


917 


PLAT 


PLAT/LH2 domain 


1.6e-29 


111.5 


1 


2-111 


917 


lipoxygenase 


Lipoxygenase 


0.00053 


-342.4 


1 


91-294 


918 


PLAT 


PLAT/LH2 domain 


1.6e-29


111.5 


1 


2-111 


918 


lipoxygenase 


Lipoxygenase


4e-06 


-297.4 


1


121-323 


926 


Aa_trans


Transmembrane amino acid 
transporter protein 


1.3e-
138 


473.9 


1 


114-517 


927


EGF 


EGF-like domain 


5.8e-36 


132.9 


6 


29-57:60-88:95- 

128:135-171:178-

209:216-247 


930


DUF6 


Integral membrane protein 
DUF6 


0.00017 


28.3 


2 


8-129:147-277 


933 


Peptidase_M24 


metallopeptidase family 
M24 


2.1e-69 


244.0 


1 


87-326


938 


PDZ 


PDZ domain (Also known 
as DHR or GLGF) 


1.8e-20 


81.4 


1 


93-174 


938 


L27 


L27 domain 


6.5e-16 


66.3 


1 


13-68 


940 


rrm 


RNA recognition motif. 


2.7e-46 


167.2 


4 


61-128: 18o-2j3:33y- 

Af\C-AZC <OA 


941 


EGF 


EGF-like domain 


1.9e-18 


74.7 


5 

— 


oo-y2:yy-ijj. ijy- 

171»17Q 91R*994-95Q 


941 


MAM 


MAM domain 


1.7e-13 


CO i 

Do. 3 


-~ 


401 ^47 


942 


EGF 


EGF-like domain 


1.9e-18 


HA 1 




71 07*104 118*144- 
/ l-y/.iUH-i Jo.i.*^H- 
1 78* 1 84-221*229-264 


942 


MAM 


MAM domain 


1.7e-13 


JO.J 


-j 

_j 


40R-<iS9 


943 


PHD 


PHD-finger 


2.9e-10 


47.5


-~ 


R-S-198 


943 


bromodomain 


Bromodomain 


8.2e-10


46.0


-j 


14Q-91S 


943 


zf-MYND 


MYND finger 


7e-07


36.3




077-1011 I 


943 


PWWP 


PWWP domain 


7.5e-06


32.9


~i 

-j 




944 


PHD 


PHD-finger 


2.9e-10 


47.5


1 


R-\-198 
0^ I/O 


944 


bromodomain 


Bromodomain 


8.2e-10 


46.0




1 40-91 S 


944 


PWWP 


PWWP domain 


7.5e-06


32.9


"T 

-~ 


960-140 


945 


PHD 


PHD-finger 


2.9e-10


47.5




85-128


945 


bromodomain 


Bromodomain 


8.2e-10 


46.0 


i 


149-235 


945 


zf-MYND 


MYND finger 


7e-07 


36.3 




1023-1057


945 


PWWP 


PWWP domain 


7.5e-06 


32.9 




269-340 


946 


PHD 


PHD-finger


2.9e-10


47.5




90-133 


946 


bromodomain 


Bromodomain 


8.2e-10 


46.0 




154-240 


946 


zf-MYND 


MYND finger 


7e-07 


36.3 




1028-1062 


946 


PWWP 


PWWP domain 


7.5e-06 


32.9 




274-345 


950 


ion trans 


Ion transport protein 


3.5e-19 


77.1 




345-518 


951 


Reprolysin


Reprolysin (M12B) family
zinc metallo


3e-88 


306.6 




210-409 


951 


Pep_M12B_pr
opep


Reprolysin family
propeptide 


1.3e-31 


118.4 


i 


80-198


951


disintegrin 


Disintegrin 


2.5e-23 


90.9 


1 


426-501 


953 


ank 


Ankyrin repeat 


2e-46 


167.6 


7 


151-183:184- 
215:216-248:250- 
282:283-328:329- 
361:362-401 


954 


interferon 


Interferon alpha/beta 
domain 


1.8e-17 


71.5 


1 


16-171 


956 


adh_short


short chain dehydrogenase 


1.3e-07 


21.8 


1 


31-188 


958 


acid_phosphat


Histidine acid phosphatase 


1.7e-58 


207.7 


1


30-381


959 


serpin


Serpin (serine protease 
inhibitor) 


6.4e- 
179 


607.8 




1-329 


960 


serpin 


Serpin (serine protease 
inhibitor) 


9.1e-
200 


677.1 


1 


47-397 


961 


serpin 


Serpin (serine protease 
inhibitor) 


3.2e- 
200 


678.5 


1 


47-397 


962 


serpin 


Serpin (serine protease 
inhibitor) 


1.2e- 
203 


689.9 


1


47-397 


964 


Reprolysin 


Reprolysin (M12B) family 
zinc metallo 


5.8e-96 


332.2


1 


232-426 


964 


Pep_M12B_pr 
opep 


Reprolysin family 
propeptide 


4.4e-41 


149.9 


1 




1 12-zZU 


964 


disintegrin 


Disintegrin 


2.5e-l)y 


44. J 




AAA ^17 1 


965 


Uteroglobin 


Uteroglobin family 


1.4e-05 


31.8 


i 


1-88 


966 


GDA1_CD39 


GDA1/CD39 (nucleoside 
phosphatase) family 


5.7e-92 


319.0 


1 

— 


AQ AQ1 


967 


C1q


C1q domain


C 1 ~. A A 

o.le-44 


1 <0 A 




71.909 


970 


ig 


Immunoglobulin domain 


1.6e-06 


35.1 




41-124:156-230 


970 


zf-CCHC 


Zinc knuckle 


5.7e-05 


30.0 




523-540 


971 


pentaxin 


Pentaxin family 


8.1e-22 


85.9 




281-479 


973 


bZIP 


bZIP transcription factor 


0.024 


19.0 




622-686 


974 


WD40 


WD domain, G-beta repeat 


0.003 


24.3 




37-72:77-113:122- 
156:211-247 


975 


ion trans 


Ion transport protein 


0.0031 


24.2 




248-408 


976 


ion trans 


Ion transport protein 


0.0031 


24.2 




322-482 


977 


zf-C2H2 


Zinc finger, C2H2 type


2.5e-55 


197.2 


35 


4-27:108-131:162- 

185:243-266:439- 

462:470-492:600- 

623:843-866:886- 

908:925-948:1030- 

1053:1114- 

1137:1193- 

1216:1265- 

1288:1312- 

1335:1369- 

1392:1470- 

1493:1515- 

1538:1577- 

1600:1660-

1683:1697- 

1720:1767- 

1790:1846- 

1869:1892- 

1914:1968- 

1990:2051- 

2073:2085- 

2107:2114-

2137:2143- 
2166:2251- 
2274:2280- 
2303:2314- 
2336:2360- 
2382:2388- 
2411:2474-2496


980 


trypsin 


Trypsin 


7.9e-18 


72.7 


1 


155-320


980 


PDZ 


PDZ domain (Also known 
as DHR or GLGF) 


8e-12 


52.7 


1


332-427 


980 


kazal 


Kazal-type serine protease 
inhibitor domain 


3.7e-05 


30.6 


1


63-117


981 


asp 


Eukaryotic aspartyl protease 


8.1e- 
104 


358.3 


1 


19-421 


984 


Zn_carbOpept 


Zinc carboxypeptidase 


2e-114


393.5 


1 


50-332 


985 


Zn_carbOpept


Zinc carboxypeptidase 


2e-114


393.5 


1 


50-332 


986 


NifU_N 


NifU-like N terminal 
domain 


4.2e-80 


279.5 


1 


34-160 


988 


UPAR LY6 


u-PAR/Ly-6 domain 


1.8e-05


31.6 


1 


28-110 


990 


zf-C2H2 


Zinc finger, C2H2 type 


1.4e-12


55.2 


3 


a no. on i i H.iin 

53-78:87- 1 14: 12U- 
144 


991 


pkinase 


Protein kinase domain 


8.6e-90 


311.7 


1 


20-312


992 


spectrin 


Spectrin repeat 


6.6e-26 


99.5 


7 


17-121:124-226:229- 
340:343-44y:4jZ- 

D DO. fO l-OOO.Oy 1-777 


994 


C1q


C1q domain


2.1e-31 


11 /.o 


i 
i 


1 &C\ OQ.A 


994 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00022 


22.3 


1 


na lie 


995 


Allantoicase 


Allantoicase repeat 


8.7e- 
122 


A 1 O A 

418.0 


i 

2 




996 


ig 


Immunoglobulin domain 


4.9e-11


50.1 


3 


3 /-IJ 1. loZ-Z*f O.Z. / J- 

335 


997 


RasGEF 


RasGEF domain 


1.7e-88 


307.4 


~ 


QQA 1 1 OA 

yyy-l 1 84 


997 


RhoGEF 


RhoGEF domain 


8.2e-68 


1*JO *7 

23 0.7 




OAT AOS 
24 /-4Z5 


997 


PH 


PH domain 


2.3e-35 


130.9 




23-133:460-588 


997 


RasGEFN 


Guanine nucleotide 
exchange factor for Ras-1 


4.9e-18 


73.3 




o33-ooo 


997 


IQ 


IQ calmodulin-binding 
motif 


0.012 


22.2 




206-226 


999 


Ktetra 


K+ channel tetramerisation 
domain 


6e-31 


116.2 


1 


24-126 


1002 


PHD 


PHD-finger 


1.9e-17 


71.4 


1 


185-233


1002 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.00078 


26.2 




108-150


1003 


WD40 


WD domain, G-beta repeat 


1.8e-24 


94.7 


6 


n£LQ QA1.ACQ 

7oo-o02:7jy- 
992:1070-1104:1110- 
1145:1151- 
1185:1191-1225 


1004 


zz 


Zinc finger, ZZ type 


4.6e-11


50.2 


1 


3-48 


1004 


zf-C2H2 


Zinc finger, C2H2 type 


0.012 


22.2 


1 


78-101 


1006 


C2 


C2 domain 


9.6e-05 


29.2 


1 


304-394 


1007 


IBN_NT 


Importin-beta N-terminal 
domain 


9.5e-28 


105.6 


1 


22-101 


1009 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.4e-35 


131.6 


1 


250-373 
1009


PH 


PH domain 


1.7e-14 


61.6 


1 


136-227 


1009 


ank 


Ankyrin repeat 


2e-11


51.4


2 


411-446:447-479 


1009 


SH3 


SH3 domain 


1.7e-10 


48.3 


1 


881-938 


1011 


ig 


Immunoglobulin domain 


1.2e-48


175.1 


6 


80-148:183-242:281- 

342:379-440:474- 

535:570-634 


1015 


efhand 


EF hand 


3.7e-26 


100.3 


4 


29-57:65-93:102- 
130:138-166 


1018 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


3.7e-76 


266.4 


1 


87-350 


1019 


LRR 


Leucine Rich Repeat 


2.9e-41


150.5 


14 


82-105:106-129:133- 

157:158-181:182- 

205:206-229:251- 

272:329-352:377- 

399:403-426:427- 

AAA.A£.*i AOCCin 

444:463-486:537- 


1021 


RasGEF 


RasGEF domain 


1e-47


172.0 


1 


907-1092 


1021 


PDZ 


PDZ domain (Also known 
as DHR or GLGF) 


4.2e-17 


70.2 




580-661 


1021 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


3.8e-13 


57.1 




345-435 


1021 


RA 


Ras association 
(RalGDS/AF-6) domain 


1.3e-05


32.1 




799-885 


1022 


RasGEF 


RasGEF domain 


1e-47


172.0 




OCT 1A/IO 


1022 


PDZ 


PDZ domain (Also known 
as DHR or GLGF) 


4.2e-17 


70.2 


1 


530-611 


1022 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


3.8e-13 


57.1 


1 


295-385 


1022 


RA 


Ras association 
(RalGDS/AF-6) domain 


1.3e-05 


32.1 


1 


749-835 


1026 


Ricin_B_lectin 


QXW lectin repeat 


1.3e-11


52.1 


3 


1 J4-1 /z: lo/- j 
225:226-265 


1027 


SCF 


Stem cell factor 


2.4e-
119


409.9


1 


i n i A 


1028 


cadherin 


Cadherin domain 


1.9e-75 


264.0 


4 


50-141:155-250:264- 

'\&&*'\ 70-4.70 


1029 


cadherin 


Cadherin domain 


1.4e-78 


274.5 


4 


50-141:155-250:264- 


1030 


PH 


PH domain 


1.2e-10 


48.8 


1 


522-624 


1031 


Renal_dipeptas 
e 


Renal dipeptidase 


1.3e-73 


258.0 




ca inn 


1032 


aa_permeases 


Amino acid permease 


3.9e-07 


-148.3 


1 


40-475


1032 


Aa_trans 


Transmembrane amino acid 
transporter pro 


0.0067 


-123.4 




42-460


1033 


FTHFS 


Formate-tetrahydrofolate 
ligase


0 


1367.2 


1 


360-979


1033 


THF DHG C 
YH 


Tetrahydrofolate 
dehydrogenase/cyclohyd 


1.5e-07 


21.3 




68-180 


1033 


THF DHG C 
YH C 


Tetrahydrofolate 
dehydrogenase/cyclohyd 


3.7e-05 


-45.5 




182-329 


1035 


RhoGEF 


RhoGEF domain 


9.1e-26 


99.0 




778-962 


1035 


PDZ 


PDZ domain (Also known 
as DHR or GLGF) 


4.2e-12 


53.6 




47-122 


1035 


PH 


PH domain 


0.081 


19.5 


1 


1006-1119 
1037


PH 


PH domain 


2.4e-10


An q 
4/.o 


1 


I /-lz4 


1037 


efhand 


EF hand


2.7e-08 


41.0 


2 


138-166:174-202


1039 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


3.9e-22 


87.0 


1 


40-289 


1040 


tsp_3 


Thrombospondin type 3 
repeat 


1.1e-22


88.9 


9 


404-418:440- 

A<A*A&1 AHH'AOQ 

513:522-536:537- 
551:560-574:600- 
614:615-627 


1040 


TSPN 


Thrombospondin N- 
terminal -like domain 


2.3e-05 


22.9 


1 


1-101 


1042 


PTR2 


POT family 


7.4e-85 


295.3 


1 


103-471 


1043 


FH2 


Formin Homology 2 
Domain 


4e-105


362.7 


I 


595-1038


1044 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e- 
140 


478.8 


19 


114-136:142- 
164:170-192:198- 
220:226-248:254- 
276:282-304:310-
3JZ:jjo-jou.joo- 
388:394-416:422-

AAA'A^d.AlO -/17R- 
^00-^06-^9 8 '^4- 

5^6-562-584*590- 
612-618-640 


1044


T/T> A D 


JtvKAr) DOX 


6 At* 97 


109 q 


i 
i 


8-48 


1044


Zt-nrSU 


BED zinc finger 


\J.\jyj 


11/. J 


9 


431-473-603-641 


1046 


T% A 

PA 


PA domain 






1 
I 


155-255 


t f\A O 

1048 


TIG


IPT/TIG domain


j. ye- d / 


709 6 


*\ 
J 


803-893*895- 
980:983-1092 


lU4o 


PQT 




7.4e-26 


99.3


2 


468-519:759-801 


1 C\AQ 
Who 


Sema 


Sema domain


1.6e-11


-3.7 


1 


34-449 


1049 


BTB 


BTB/POZ domain 


1.7e-26 


101.4 


1 


20-124 


1 A<A 




adL/ transporter 


7>7C J / 


135 5 


1 


26-217 


1 AO 


zz


Zinc finger, ZZ type


4.6e-11


50.2 


1 


3-48 


1051 


zf-C2H2 


Zinc finger, C2H2 type 


0.012 


22.2 


1 


78-101 


1052 




Immunoglobulin domain 


1 9p.1 1 


S9 9 


9 

A* 


34-110:150-204


1053 


CUB 


CUB domain 


2.5e-12 


54.4 


1 


156-260 


1053 


WSC 


WSC domain 


0.002 


18.6 


1 


71-142


1054 


ig 


Immunoglobulin domain 


A AAO/£ 


A 

Z4.4 


1 
1 


1 1 ^ 

jO- 1 1 J 


1055 


MHC_I


Class I Histocompatibility 
antigen, domains 


2.4e- 

140 


499.6 


1 


25-203 


1055 


ig 


Immunoglobulin domain 


8.5e-08 


39.3 


1 


220-285 


1057 


LBP_BPI_CE
TP_C


LBP / BPI / CETP family, 
C-terminal do 


0.00076 


A O 


1 
1 


01 9 AAA 


1062 


PMP22_Claudi
n


PMP-
22/EMP/MP20/Claudin
family


1.8e-44


161.2 


1 


4-181 


1064 


PDZ 


PDZ domain (Also known 
as DHR or GLGF) 


4.8e-71 


249.5 


5 


1-84:209-297:310- 
393:409-490:694-775 


1065 


PID


Phosphotyrosine interaction 
domain (PTB/PID) 


1.1e-44


161.8 


1 


42-168 


1067 


pkinase 


Protein kinase domain 


2.8e-73 


256.8 


1 


12-272 


1068 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr 


5.6e-37 


136.3 


1 


38-185 


1069 


lactamase B 


Metallo-beta-lactamase 


3e-35 


130.6 


1 


7-172 






b u{jci idiiiii y 










1070 
lv/U 




A nnpvin 




440 I 


4 


57-124:128-196:212-
280:288-355 


1071 
IU/ 1 




Sodium:neurotransmitter

symporter family 


O 

V 


1202.5 


1 


44-574 


1079 




Immunoglobulin domain


0 0008 

v.vrv/V/O 


26.1 




38-122 


1073 


Glypican 


Glypican 


2.1e- 

901 


981.5 


1 


3-566 


1074 


PAP assoc 


PAP/25 A associated domain 


4.2e-12 


53.7 




490-549 


1 (Y7A 


rrm 


RNA recognition motif.


7 9f»-0R 


J7.U 




JO 1*J 


1075 


Glycqjransf_2 

o 

y 


Glycosyltransferase family 

70 


3.6e-69 


243.2 




213-507 


1078 


A2M 


Alpha-2-macroglobulin 
family 


3.4e- 

1Q^ 


661.8 


1 


721-1469 


1078 


A2M_N 


Alpha-2-macroglobulin 
family N-terminal regi 


1.6e-88 


307.5 


1 


1-623 


1079 


A2M_N 


Alpha-2-macroglobulin 
family N-terminal regi


4.7e-90 


312.6 


1 


14-636 


1080 


A2M_N 


Alpha-2-macroglobulin 
family N-terminal regi


l.Se-38 


141.5 


1 


1-563 


1081 


A2M 


Alpha-2-macroglobulin 
family 


1.3e- 
200 


679.9 




721-1469 


1081 


A2M_N 


Alpha-2-macroglobulin 
family N-terminal regi 


1.6e-88 


307.5 




1-623 


1082 


A2M_N 


Alpha-2-macroglobulin 
family N-terminal regi 


4.7e-90


312.6 




1-623 


1083 


COesterase 


Carboxylesterase 


2.1e- 
155 


529.7 




6-547 


1084 


EGF 


EGF-like domain 


9.5e-90 


311.6 


18 


192-219:404- 

431:631-666:878- 

914:920-956:962- 

997:1003-1037:1043- 

1078:1084- 

1 1 19: 1 IZ5- 

1160:1166- 

1201:1207- 

1285:1291- 
1328:1429- 

1 A&fr 1 d79 
lH-Ou. I*f l jL- 

1507:1626- 
1661:1667-1706 


1 AO A 


TB 


IB domain 


i.oe-/o 


77A 1 




00 /-OlV/.UOO- 

729:1358-1401:1535- 

1 S77 


i ne£ 


tni 


— — : — — ; — 

Fibronectin type III domain 


j.ye-yj 




0 


SP»7*609-riR*v700- 

JO/ .UUZ-UOJ. /uu- 

786:802-888 


1086 


ig 


Immunoglobulin domain 


3e-24 


94.0 


4 


168-232:285- 

347:1133-1191:1349- 

1409 


1087 


zf-C2H2 


Zinc finger, C2H2 type 


4.6e-33 


123.3 


4 


161-183:189- 
211:217-239:245-267 


1087 


KRAB 


KRAB box 


1.9e-24


94.6 


1 


14-54 


1088 


KRAB 


KRAB box 


1.9e-24 


94.6 


1 


14-54 


1088 


zf-C2H2 


Zinc finger, C2H2 type 


1e-07


39.1 


1 


161-183 


10K9 


Keratin_B2


Keratin, high sulfur B2
protein




71 0 

1 l.U 


i 

i 




1090 


Keratin_B2


Keratin, high sulfur B2
protein


0 0059 

U.WJ7 




i 


7 7A 
z-/o 


1091 


Keratin_B2


Keratin, high sulfur B2
protein


1 9e-06 


21 1 
^i.i 


? 


7 10R-1 1 1 ' 


1092 


abhydrolase 


alpha/beta hydrolase fold


1.6e-12


55.0 


1 


111-390 


1093 


abhydrolase


alpha/beta hydrolase fold


1 6e-17 


5S 0 


1 

1 


171 4^0 


1094 


7tm_3


7 transmembrane receptor 


5.5e-08 


10.8 


1 


22-271


10Q6 

IU7V 


lectin_c


Lectin C-type domain




Q7 7 


1 

1 


1 A A OAR 


1097 


lectin_c


Lectin C-type domain 


6.5e-27 


102.8 


1 


100-208 


1 AQB 


7tm_1


7 transmembrane receptor
(rhodopsin family)


i. /e-4 i 


K1 1 
IJI.J 


1 
1 


OQA 


1 AOO 

luyy 


PC A 


oca domain 


A AAA77 


77 7 


1 
I 


11CI AAH 


1100 


ig 


Immunoglobulin domain 


5e-11


50.1 


3 


146-203:245- 
zyj.jj i-4Uj 


1101 


An_peroxidase 


Animal haem peroxidase 


2.7e- 


658.9 


1 


726-1265 


1 1 A1 


ig 


— — : 

Immunoglobulin domain 


4.4e-jo 




A 


74R KM-IAA 
Z4o-jU/.j44- 

400:433-490:525-582 


1 1 ai 


LRR


Leucine Rich Repeat 


1 1p 75 


07 1 


c 

J 


^1 74«7^ QR*QQ 

122:123-146:147-170 


1 1 ai 


LRRCT


Leucine rich repeat C- 
terminal domain 


0>1„ 11 

o.4e-i i 


AO 1 


1 


1 RO 719 
1 617-Z OA 


1 1 ai 


VWC 


von Willebrand factor type 
C domain 


/e-uo 




1 
1 


ljyj-14JU 


1 1 ni 
11U1 


TILa


TILa domain


A A71 


17 A 
1Z.U 


i 
1 


1 104-1447 
lji/4-144/ 


1102 


An_peroxidase 


Animal haem peroxidase 


2.7e- 

1QA 

iy4 


658.9 


1 


702-1241 


1 in? 

1 IUZ 


ig 


Immunoglobulin domain


4.4c- jO 


nil 
1 Jj.j 


A 
*t 


176-409-466- 501 -5 S 8 

J / V.*TV7>f- , TV7V7. J\J 1 J J O 


1102 


LRR 


Leucine Rich Repeat 


3.2e-21 


83.9 


4 


51-74:75-98:99- 
122:123-146


1 11/ <£ 


LRRCT


Leucine rich repeat C-
terminal domain


0.*tG" 1 i 


49 ^ 

*T-^.-7 


i 

1 


156-208 


110? 


vwc


von Willebrand factor type
C domain






1 
I 


1371-1426 


110? 


TILa


TILa domain


0.023 


12.0 




1370-1423 


1113 


pkinase 


Protein kinase domain 


3e-45 


163.8 


1 


194-468 


1117 
111/ 


ig 


Immunoglobulin domain






A 

*T 


10-R7- 197-1 fi6-?R1- 
337:375-434 


1118 


ig 


Immunoglobulin domain 


0.00012 


28.9 


2 


42-98:136-195 


I 1 1Q 

I I ly 


TT3XT XTT 


Importin-beta N- terminal 
domain 


j.*fe-zj 


QA ^ 


i 


7R inn 


1 1 9A 
X 1ZU 


ank 


Ankyrin repeat 


/ . / e-z i 


R7 7 


z 


Q9A Q^9 QS1 ORS 


1 19A 
1 IZU 


SH3


SH3 domain


£ In K 

o.ie-io 


Oj. 1 


L 


1A99 1070 


n?? 




TPR Domain 


6 4e-09 


41 1 

*T«7. 1 


»7 


174-157- 158- 

1 Zi*T Uf .1JO 

191:192-225 


1124 


ank 


Ankyrin repeat 


2.9e-46 


167.1 


6 


31-63:64-96:97- 

129:130-162:163- 

195:196-228 


1125 


ank 


Ankyrin repeat 


3.4e-38 


140.3 


5 


31-63:64-96:97- 
129:130-162:163-195 


1129 


F5_F8_type_C


F5/8 type C domain 


1.4e-54 


194.8 


1 


34-174 


1129 


laminin_G


Laminin G domain 


1.4e-07 


38.6 


1 


212-344 


1130 


F5_F8_type_C


F5/8 type C domain 


1.4e-54 


194.8 


1 


34-174 


1130 


laminin_G 


Laminin G domain 


3.1e-44 


160.4 




212-344:398- 

525:821-943:1046- 

1179 


1130 


EGF 


EGF-like domain


9.1e-07 


35.9 




551-583:962-996 


1131 


Glycos_transf_ 
2 


Glycosyl transferase 


6.5e-31 


116.1 


1 


155-341 


1131 


Ricin B lectin 


QXW lectin repeat 


0.00059 


26.6 




467-507:558-596 


1133 


pkinase 


Protein kinase domain 


1.7e-48 


174.5 




11-347 


1135 


C2 


C2 domain 


1.1e-42


155.2 




7-88:135-216 


1135 


RasGAP 


GTPase-activator protein for 
Ras-like GTPase 


5.2e-34 


126.4 


1 


323-513 


1135 


PH 


PH domain 


5.8e-08 


39.9 


1 


567-673 


1135 


BTK 


BTK motif 


9.2e-05 


28.9 


_ 


675-711 


1137 


MAM 


MAM domain 


1.1e-22


88.9 




452-593 


1137 


EGF 


EGF-like domain 


3.5e-15 


63.9 




60-86:123-157:163- 
197:203-242:248-283 


1143 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


0.00045 


-23.4 


i 


1-173 


1144 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


2.2e-40 


147.6 


1 


40-287 


1147 


IL1 


Interleukin-1/18 


4.3e-21 


83.5 


1 


12-152 


1148 


filament 


Intermediate filament 
protein 


5e-101 


349.0 


1 


1-299 


1150 


MBOAT 


MBOAT family 


1.4e-06 


-27.4 




130-323 


1151 


filament 


Intermediate filament 
protein 


2.1e- 
116 


400.1 


{ 


131-412 


1152 


Peptidase_M10 


Matrixin 


4.4e-84 


292.8 


i 


36-202 


1152 


hemopexin 


Hemopexin 


6e-37 


136.2 




231-273:275- 
317:322-369:371-411 


1153 


Peptidase_M10 


Matrixin 


4.4e-84 


292.8 




36-202 


1153 


hemopexin 


Hemopexin 


6e-37 


136.2 




231-273:275- 
317:322-369:371-411


1155 


LBP BPI CE 
TP C 


LBP/ BPI /CETP family, 
C-terminal do 


3.1e-30 


113.9 


' 


242-478 

i 


1155 


LBP BPI CE 
TP 


LBP /BPI /CETP family, 
N-terminal do 


3.3e-22 


87.2 


1 


26-240 


1156 


HMG_box


HMG (high mobility group) 
box 


3.1e-31 


117.2 


1 


85-153 


1159 


DNA_ligase


ATP dependent DNA ligase 
domain 


3.7e-57 


203.3 


1 


480-645 


1159 


zf-PARP 


Poly(ADP-ribose) 
polymerase and DNA- 
Ligase 


8.5e-52 


185.5 


1 


93-185 


1160 


serpin 


Serpin (serine protease 
inhibitor) 


7.7e- 
150 


511.2 


1 


3-425 


1167 


ig 


Immunoglobulin domain 


3.4e-16 


67.2 




42-9o: 1 3 j- ly /:/3 /- 
297 


1169 


lectin c 


Lectin C-type domain 


2e-18 


74.6 




131-231 


1171 


WD40 


WD domain, G-beta repeat 


4.4e-80 


279.5 


8 


224-260:280- 
316:321-357:363- 
398:404-440:446-
491:497-533:539-574 


1172 


MBOAT 


MBOAT family 


1.6e-08 


6.7 


1 


488-777 


1172 


ig 


Immunoglobulin domain 


2.9e-08 


40.9 


2 


42-99:139-198 
1173


MBOAT 


MBOAT family


1 /Za. AO 

l.oe-Uo 


O. / 


l 


a on nnn 


1173 


ig 


T 111* 1 - - 

Immunoglobulin domain 


O On AO 




o 

L 


4z-yy.i3y-iyo _^ 


1174 


MBOAT 


MBOAT family


5.1e-65 


229.4 




130-373 


1175 


PTE 


Phosphotriesterase family


1 A ^ AA 

1 .4e-9U 


11 A A 

314.4 


J 


6-233 


1183 


PS_Dcarbxylas 
e 


Phosphatidylserine 
decarboxylase 


0.3e-4z> 


i/co /C 

loZ.o 




232-467 


1184 


TSC22 


TSC-22/dip/bun family 


1 A A 

1.3e-4U 


1 /tO A 


— 


124- lo3 


1188 


DPPIV_N_ter
m 


.'11 > • t 11 / 

Dipeptidyl peptidase IV 
(DPP IV) N-termi 


C 1 AO 


*71 *7 

-71./ 




111 £Ort 

132-680 


1188 


Peptidase_S9


Prolyl oligopeptidase family


1 *7« A/C 

1.7e-Uo 


QC A 
33.0 




/CO/1 *7/C/1 

Oo4- /04 


1189 


DPPIV_N_ter
m 


Dipeptidyl peptidase IV 
(DPP IV) N-termi 


5.1e-08 


-71.7 




132-680 


1189 


Peptidase_S9 


Prolyl oligopeptidase family 


1.7e-06 


35.0 




684-764 


1190 


DPPIV_N_ter
m 


Dipeptidyl peptidase IV 
(DPP IV) N-termi 


3.8e-07 


-94.7 




132-667 


1190 


Peptidase_S9 


Prolyl oligopeptidase family 


1.7e-06


35.0 




671-751 


1191 


Ribosomal S2 
5 


S25 ribosomal protein 


6.5e-66 


232.4 




1-100 


1193 


ank 


Ankyrin repeat 


1.2e- 
239 


809.5 


27 


49-81:82-114:115- 

147:148-180:181- 

213:214-246:247- 

279:280-313:314- 

346:347-379:380- 

412:431-463:464- 

496:497-557:558- 

591:593-625:626- 

S~ c r\ SSf\ /rtl 

658:660-692:696-
728:729-761:762- 
797:798-827: o3U- 
864:865-897:898- 
y3 1 .y3z-yo*t.yoo- 

1000 


1194 


trypsin 


Trypsin 


2.5e-18 


74.3 




100-34Z 


1196 


vwc 


von Willebrand factor type 
C domain 


0.043 


12.4 




50-105:108-163 


1197 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.2e-28


108.6 


1 


46-295 


1198 


MethyltransfD 
12 


D12 class N6 adenine-
specific DNA met


0.0057 


-49.7 


I 


30-153 


1199 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr 


1.3e-22 


88.6 




10 1 t< 
3z-l /O 


1200 


tRNA-synt_2 


tRNA synthetases class II 
(D, K and N)


7.4e-91 


315.3 




13D-4/3 


1200 


tRNA_anti 


OB-fold nucleic acid 
binding domain 


7.3e-11


49.5 




44- 1 i 0 


1202 


FAD_binding_ 
L 


FAD binding domain 


8.6e-09 


-83.1 


1 


C 1 /CO 

5- 1 62 


1203 


RasGEF 


RasGEF domain 


1.9e-16 


68.1 




211-412 


1204 


KH-domain 


KH domain 


1.9e-50 


181.0 




17-63:101-150:265- 
313 


1206 


transket_pyr 


Transketolase, pyridine 
binding domai 


4e-74 


259.7 




14-191 


1206 


transketolase 
C 


Transketolase, C-terminal 
domain 


5e-55 


196.2 




208-331 


1207 


Calsequestrin 


Calsequestrin 


1.7e- 


1001.7 




1-390 
297








1210 


ig 


Immunoglobulin domain 


1.1e-13


58.9 


2 


35-112:154-228 


1213 


cadherin 


Cadherin domain 


9e-81 


281.8 


6 


33-126:140-235:249- 

O A 1 <■> fl A AO A 

343:357-448:462- 
558:576-667 


1214 


calreticulin 


Calreticulin family 


2.7e- 
206 


698.7 


1 


21-317 


1221 


Osteopontin 


Osteopontin 


4.6e- 
173 


588.4 


1 


1-279 


1222 


serpin 


Serpin (serine protease 
inhibitor) 


2.4e- 
155 


529.5 


1 


OA AA1 

80-443 


1223 


ig


Immunoglobulin domain 


A 0_ 1 C 

4.8e-15 


/Z1 A 


Z 




1225 


DNA_topoisoI
V


DNA gyrase/topoisomerase 
IV, subunit A 


3.7e- 
180 


611.9 


I 


653-1120 


1225 


DNA_gyraseB 


DNA gyrase B 


1.3e-56 


201.6 


1 


210-370 


1225 


HATPase_c 


Histidine kinase-, DNA 
gyrase B-, and H


1.8e-13


58.2 


1 


16-164 


1226 


AMP-binding 


AMP-binding enzyme 


3.6e-80 


279.7 


1 


105-539 


1227 


PCI 


PCI domain 


0.016 


18.5 


1 


26-117 


1228 


C1q


C1q domain


5.9e-45 


162.8 


1 


73-202 


1230 


ank 


Ankyrin repeat 


3.6e- 
215 


728.2 


28 


7-39:40-72:86- 

147:148-180:181- 

213:214-246:247- 

279:280-312:313- 

346:347-379:380- 

412:413-445:464- 

496:497-529:530- 

590:591-621:626- 

658:659-691:693- 

725:729-761:762- 

794:795-827:832- 

862:864-897:899- 

931:932-965:966- 

AAO.IAAO 1A1/I 

99o:100z-lUJ4 


1231 


LBP BPI CE 
TP C 


LBP/ BPI /CETP family, 
C-terminal do 


9.4e-24 


92.3 


1 


242-470 


1231 


LBP_BPI_CE 
TP 


LBP /BPI /CETP family, 
N-terminal do 


3.3e-22 


87.2 


1 


26-240 


1232 


LBP_BPI_CE 
TP C 


LBP /BPI /CETP family, 
C-terminal do 


3.1e-22 


87.3 


1 


242-470 


1232 


LBP_BPI_CE
TP 


LBP /BPI /CETP family, 
N-terminal do 


3.3e-22 


87.2 


1 


26-240 


1233 


LBP BPI CE 
TP C 


LBP /BPI /CETP family, 
C-terminal do 


9.4e-32 


118.9 


1 


242-470


1233 


LBP BPI CE 
TP 


LBP /BPI /CETP family, 
N-terminal do 


3.3e-22 


87.2 


I 


26-240 


1237 


ig 


Immunoglobulin domain 


z.oe-ju 


1 \d ft 


-x 
j 


28-86:127-184:219-
277 


1237 


fn3


Fibronectin type III domain 


2.6e-28 


107.5 


2 


299-385:396-481 


1238 


Nuf2 


Nuf2 family 


8.7e- 
104 


358.2 


I 


1-148 


1240 


Sema 


Sema domain 


2.2e- 
177 


602.7 


1 


59-496 


1243 


rrm 


RNA recognition motif. 


0.05 


15.7 


1 


17-93 


1247 


EGF 


EGF-like domain 


4.8e-56 


199.6 


17 


105-135:148- 








178:191-221:234- 
264:277-307:320- 

1 C r\ *> /" a ~\ r\ /* Af\f\ 

350:364-396:409- 
439:452-482:495- 
j25:53oo68:56l- 
oi l:oz4-ojo:ooy- 
699:712-742:755- 
/oj. /ys-oZo 


1249 


IBR 


IBR domain 


0.069 


5.8 


2 


36-104:111-167 


1951 


Aa_trans


Transmembrane amino acid
transporter protein


o . / e-no 


100. o 






19*59 


Aa_trans


Transmembrane amino acid
transporter protein


1 1p-65 


911 A 
Zj 1.4 




^1^ A 1 O 


1254 


FGF 


Fibroblast growth factor 


1.7e-37 


138.0 


i 


36-166 


1255 


LRR


Leucine Rich Repeat




OA Q 




4y-/u. / i-y^.y4- 
115:116-137 


1956 


RPE65


Retinal pigment epithelial
membrane protein


* Ca 81 


780 O 






1257 


RPE65 


Retinal pigment epithelial 

membrane protein


4.7e-82 


286.0 




24-561 


1258 


ig 


Immunoglobulin domain 


3.1e-15 


64.1 




39-97:128-189 




serpin


Serpin (serine protease
inhibitor) 


1 Qp 56 


oon o 
zuu.y 




ZJ-4ZO 


1261 


arf


ADP-ribosylation factor
family


7 Qp fiQ 

/ .ye- w 


-6 8 
-o.o 




0 1 89 


1264 


PAP2 


PAP2 superfamily 


3e-11


50.8 


i 


95-241 


1265 


SRCR 


Scavenger receptor
cysteine-rich domain 


1 1p- 

i.je- 
128 


440 7 




1^ 1 98* 1 ^6-997»919 

329:360-459:477-574 


1266 


SRCR 


Scavenger receptor
cysteine-rich domain


1 1p- 
128 


AAO 7 


j 


^-198- 116 997-919 
329:360-459:477-574


1270 


Armadillo_seg


Armadillo/beta-catenin-like
repeat


1 4e-05 


19 0 


A 
*t 


51-91:546-586:611-
673:675-716 


1273 


pkinase


Protein kinase domain


Re-77 


268 6 


1 
1 


101-187 

luJ"JO / 


1275 


Reprolysin 


Reprolysin (M12B) family 

zinc metallo


3e-88 


306.6 


1 


227-426 


1275 


Pep_M12B_pr
opep 


Reprolysin family 
propeptide


1.3e-31 


118.4 


1 


97-215 


1275 


disintegrin 


Disintegrin 


2.5e-23 


90.9 


1 


443-518 


1277 


ank 


Ankyrin repeat 


2.6e-17 


70.9 


2 


301-339:340-373 


1278 


Peptidase_M1


Peptidase family M1


9 6p- 
z.oe- 

1 19 


186 5 


i 
i 


Q8 *.n6 


1284 


Aa_trans 


Transmembrane amino acid 

transporter protein


1.4e-31


118.3 


1 


4-407 


1285 


UPF0083 


Uncharacterised protein 
family (UPF0083)


1.9e-05 


14.5 


1 


73-213 


1288 


LRR 


Leucine Rich Repeat 


1.3e-23 


91.9 


7 


66-89:90-113:114- 

137:138-161:163- 

186:187-210:211-233 


1288 


ig 


Immunoglobulin domain 


2.7e-07 


37.7 


1 


314-372 


1288 


LRRCT 


Leucine rich repeat C- 
terminal domain 


5.6e-05 


30.0 


1 


252-297 


1290 


LRR 


Leucine Rich Repeat 


2.2e-12 


54.6 


3 


61-84:85-108:110- 
132 


1291 


DAGKc 


Diacylglycerol kinase 
catalytic domain 


0.063 


-14.5 


1 


74-220 
1292


ig


Immunoglobulin domain 


6.7e-10 


46.3 


2 


48-124:161-219 


1 OQ1 

izyj 


ig 


Immunoglobulin domain 


o. /e-iu 




Z 


AO iivi.i^t iin 

45-124:161-219 


1295 


C1q


C1q domain


1.4e-48 


174.8 


1 


72-198 


1296 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


O OA 

2.5e-24 


94.3 


1 


49-332 


1297 


Plectin 


Plectin repeat 


1.6e-86 


300.8 


6 


2734-2778:2808- 

2852:2897- 

2939:3003- 

3042:3043- 

3087:3119-3163 


1297 


CH 


Calponin homology (CH) 
domain 


1.6e-72 


254.3 


2 


213-316:329-433 


1297 


spectrin 


Spectrin repeat 






1 


OOA AAA 


i aao 

1298 


Plectin 


Plectin repeat 


l.oe-oo 


-lf\f\ Q 

3UU.O 


/r 
0 


T7A^ O^AA.OOOA 

z74o-z/yO:zo2U- 
2864:2909- 
zyj 1:3U1D- 
3054:3055- 


1298 


CH 


Calponin homology (CH) 
domain 


3.1e-69 


243.4 


2 


213-328:341-445 


1298 


spectrin 


Spectrin repeat 


0.029 


8.2 


1 


901-1006 




MAM 


MAM domain 


<; Co AQ 


17A 1 
1 /0. 1 




A99 ^Q^ 


1306 


ig 


Immunoglobulin domain 


5.4e-18 


73.2 




26-93:132-191:228- 

9R7 

. Zo / 


1308 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase, 
C-terminal doma 


1.6e-49 


178.0 


1 


618-769 


1308 


Acyl-
CoA_dh_M


Acyl-CoA dehydrogenase, 
middle domain 


1.4e-06 


15.3 


1 


505-614 


1309 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase, 
C-terminal doma 


1.6e-49 


178.0 


1 

— 


600-751 1 


1 o An 
1309 


Acyl-
CoA_dh_M


Acyl-CoA dehydrogenase, 
middle domain 


i a*. r\< 


1 J.J 




Aon cqz: 


1^11 
LD I 1 




IQ calmodulin-binding
motif 




97 9 
Z /.Z 


9 
z 


71 S_7^S-7^R-7^R 


1312 


SAM 


SAM domain (Sterile alpha 
motif) 


3.9e-13 


57.1 


2 


304-369:382-446 


1314 


HECT 


HECT-domain (ubiquitin-
transferase)


5.3e- 
196 


664.5 


1 


2002-2309 


1315 


PAP2 


PAP2 superfamily 


7.8e-28 


105.9 


1 


56-218 


1316 


PAP2 


PAP2 superfamily 


1.6e-32


121.5 


1 


88-236 


1317 


ig 


Immunoglobulin domain 


2.7e-07 


37.6 


1 


41-116 


1321 


LRR 


Leucine Rich Repeat 


1.9e-66 


234.2 


20 


145-168:169- 

194:195-217:240- 

265:266-285:287- 

310:311-336:337- 

356:358-381:382- 

4n7'dfi8 AOl-dOQ 
HKJ 1 .HUO-HZ / .*rZ>- 

452:453-478:479- 
498:500-523:524- 
549:550-569:571- 
594:595-620:621-644 


1321 


LRRNT 


Leucine rich repeat N- 
terminal domain 


0.0027 


24.4 


1 


115-143 


1322 


ig 


Immunoglobulin domain 


3.6e-14 


60.5 


3 


34-120:157-215:267- 
321 



WO 2004/080148 



PCT/US2003/030720 



489 
TABLE 4A 



SEQ 
ID 


Model 


Description 


E- 
value 


Score 


Repeats 


Position


1323 


ig 


Immunoglobulin domain 


7.8e-06 


oo o 

32.8 


3 


1A 1 OA- 1 ^o - 01<.0<7 

J4-1ZU. 15 /-zl5.Zo/- 
111 


1324 


tsp_l 


Thrombospondin type 1
domain 


A AflAl n 


0*7 1 


i 
I 


17 81 


1328 


SRCR 


Scavenger receptor 
cysteine-rich domain


i.De- 

171 


JOJ.D 


c 
J 


14 1 1 1 -1 88 98-M00- 
197*405-503*638-730 


133 1 


efhand


EF hand


1 Sp-06 
1. JG-UO 


1S 9 


1 

O 


12-40:48-76:85-113


1 OOO 

1333 


wnt 


wnt family 


D.oc- 

205 


604 1 


1 
1 


40-365 


133o 


zf-MIZ


MIZ zinc finger


1 9p-19 






323-375 


1336 


SAP 


SAP domain 


2.4e-05 


31.2 


1 


11-45 


1337 


FA desaturase 


Fatty acid desaturase 


2.1e-76 


267.3 




71-296 


1338 


Retrotrans gag 


Retrotransposon gag protein 


A f\Q*7 

u.uy / 


O./ 


~ 


oaa inn 


1340 


actin 


Actin 


l.ye-oi 


O 1 7 ^ 

Zl /.J 




1 1£7 


1341 


ion trans 


Ion transport protein 




ZZ.j 


-j 


I i 7 O AO 

I I /-jUZ 


1343 


fn3


Fibronectin type III domain 


7.3e-33 


122.6 


2 


394-480:492-578 


1343 




Immunoglobulin domain 


1.1e-23


AO 1 

92.1 


o 
3 


IZ4-1BZ.ZZ4- 

281:316-372 


1344 


ig 


Immunoglobulin domain 


5e-56 


199.5 


6 


53-110:150-216:255- 

n A.i^A/H7«ii<A 
31U.JJVJ-41 /.*oo- 

516:553-617 


1344 


MAM 


MAM domain 


1.3e-52 


188.2 


1 


753-918 


1345 




Immunoglobulin domain 


5.9e-05 


29.9 


1 


10O-Z55 


1345 


kazal 


Kazal-type serine protease 
inhibitor domain 


0.00028 


27.6 


1 


121-168 


1348 


ig 


Immunoglobulin domain 


3.4e-51


1 oo c 

183.5 


6 


Ol-lzU. 155-Z14.Z55- 

315:348-404:440- 
497:530-596 


1348 


fn3 


Fibronectin type III domain 


A A _ Af\ 

4.4e-40 


14o.o 


A 

4 


/CI ^ 7A4-71 7 
015-/U4. / 1 /- 

807*8 1 Q-Q07-Q1 0- 

1002 


1350 


serpin 


Serpin (serine protease 
inhibitor) 


1 O^ 

J.ze- 
205 


£OC O 


1 
1 


yliC /1AO 


1353 


CARD 


Caspase recruitment domain 


1 In OO 

1.3e-3z 


IZl.O 


1 
1 


9-01 

z-y i 


1355 


ank 


Ankyrin repeat 


1.1e-45


165.2 


6 


31-63:64-96:97- 

1 9Q« 110.1 69 -161- 
izy. i ju- luz. iuj- 

10*; -106-998 


1356 


pkinase


: — 7~. j ; 

Protein kinase domain 




99S 9 

ZZJ.Z 


1 

1 


991-470 


1359 


tRNA-synt_l 


tRNA synthetases class I (I,
L, M and V)


l. ie-UD 


914 4 
-Zl*t.*t 


1 
1 


Jl'JOJ 


1360 


MHC_II_beta


Class II histocompatibility 
antigen, beta 


1 . /e-4 1 


1 J 1. j 


1 

1 


41-1 17 


1363 


ig 


Immunoglobulin domain 


1 1 _ AO 

l.le-Uo 


/19 1 
4Z..3 


o 

j 


1 14-900*91^- 
1 IH-Zvl/.ZJO- 

294:344-398 


1364 


Tissue fac 


Tissue factor 


0.069 


-126.3 


i 


1-271 


1364 


fn3


Fibronectin type III domain 


0.095 


14.9 


i 


35-125 


1365 


IL1 


Interleukin-1 / 18 


7.6e-30 


112.6 


i 


11-155 


1366 


A2M 


Alpha-2-macroglobulin 
family 


le-210 


713.4 


i 


722-1449 


1366 


A2M_N 


Alpha-2-macroglobulin
family N-terminal regi 


4.7e-90 


312.6 


i 


1-623 


1368 


UPAR LY6 


u-PAR/Ly-6 domain 


6.8e-37 


136.0 


i 


27-106 
685


Guanylin 


Guanylin precursor 


0.72 


1.2 


1 


1-27 


685 


hormone 


Somatotropin hormone family 


6.7e-18 


49.4 


1 


9-57 


685 


DUF756 


Domain of unknown function 
(DUF756)


0.4 


5.1 


1 


99-125 


686 


Guanylin 


Guanylin precursor 


0.72 


1.2 


1 


1-27 


686 


hormone 


Somatotropin hormone family 


3.6e-56 


157.9 


1 


9-151 


686 


PI3 PI4 kinase 


Phosphatidylinositol 3- and 4-kinase 


0.97 


3.4 


1 


172-206 


688 


hormone 


Somatotropin hormone family 


1.5e-68 


192.9 


1 


9-151 


689 


serpin 


Serpin (serine protease inhibitor) 


3.2e-21 


71.8 


1 


49-156 


689


serpin


Serpin (serine protease inhibitor) 


5.2e-57 


193.9 




160-397 


690 


PH 


PH domain 


0.042 


8.1 


1 


1-20 


690 


efhand 


EF hand 


9.2e-05 


21.0 


1 


34-62 


690 


efhand 


EF hand 


0.0023 


15.8 




70-98 


690 


PI-PLC-X 


Phosphatidylinositol-specific phospho


5.9e-17 


60.5 


4 — 


187-222 


691 


Lipase_3 


Lipase (class 3) 


6.9e-18 


63.4 




366-505 


691 


Desulfoferrodox 


Desulfoferrodoxin


0.9 


2.2 




528-533 


692 


PH 


PH domain 


4.7e-05 


17.9 


- — 


20-127 


692 


DUF482 


Protein of unknown function, DUF482 


0.8 


2.7 




50-67 


692 


Phage TAC 


Phage tail assembly chaperone 


0.21 


5.3 


1


225-245 


692 


Glyco_hydro_31


Glycosyl hydrolases family 31


0.8 


0.9 


1


344-379 


692 


NHL 


NHL repeat 


0.25 


8.5 


1


494-509 


692 


EspB 


Enterobacterial EspB protein 


0.27 


2.1 


1


560-578 


694 


GDA1_CD39 


GDA1/CD39 (nucleoside phosphatase) 
fa 


1.6e-55


187.0 


1


93-332 


694 


Ppx-GppA


Ppx/GppA phosphatase family 


0.4 


3.5 


1


249-261 


694 


GDA1_CD39 


GDA1/CD39 (nucleoside phosphatase) 
fa 


5.1e-05 


15.7 




430-480 


695 


7tm_1


7 transmembrane receptor (rhodopsin f 


8.1e-28 


82.0 


\ — 


22-294 


695 


GSPU N 


Bacterial type II secretion system pr 


0.41 


3.4 




110-118 


695 


GASA 


Gibberellin regulated protein 


0.72 


0.6 


j — 


176-197 


696 


DUF716 


Family of unknown function (DUF716) 


0.93 


3.4 




45-73 


696 


DcuC 


C4-dicarboxylate anaerobic carrier 


0.4 


4.3 


1


46-67 


696 


FLO_LFY


Floricaula / Leafy protein


0.22 


2.7 


1


146-159 


696 


lectin_c


Lectin C-type domain 


1.9e-07 


31.5 


1


181-286 


696 


Rubella_E2


Rubella membrane glycoprotein E2 


0.95 


1.4 


1


284-312 


698 


CDtoxinC 


Cytolethal distending toxin C 


0.43 


3.9 




9-33 


698 


GDA1_CD39


GDA1/CD39 (nucleoside phosphatase) 
fa 


1.6e-62 


210.7 


1


40-275 


698 


GDA1_CD39 


GDA1/CD39 (nucleoside phosphatase) 
fa 


0.016 


7.2 


2 


376-393 


700 


zf-MYND 


MYND finger 


0.39 


5.1 


1 


173-192 


700 


Ribosomal_L44


Ribosomal protein L44 


0.33 


5.8 


1 


183-208 


700 


ZZ 


Zinc finger, ZZ type 


0.0003 


17.8 


1 


184-211 


700 


PilP 


Pilus assembly protein, PilQ 


0.028 


8.4 


1 


228-244 


700 


myb_DNA-binding


Myb-like DNA-binding domain 


2.6e-09 


37.1 


1


231-278 


700 


RRS1 


Ribosome biogenesis regulatory protei 


0.85 


3.5 




379-390 


701 


sigma70_ner 


Sigma-70, non-essential region 


0.45 


3.2 




616-628 


702 


zf-AN1


AN1-like Zinc finger


0.032 


10.1 




13-52 


702 


zf-AN1


AN1-like Zinc finger


9.2e-06 


22.6 




103-135 


703 


CRAL_TRIO_N


CRAL/TRIO, N-terminus 


3.8e-13 


44.7 




3-71 


703 


DnaJ_C


DnaJ C terminal region 


0.054 


8.2 




8-20 


703 


CRAL_TRIO


CRAL/TRIO domain 


1.4e-44


151.9 




85-244 


704 


Adrenomedullin 


Adrenomedullin 


0.82 


2.4 




142-167 
704


Rhomboid 


Rhomboid family 


1.6e-14 


55.3 


1 


201-304 


705 


TRAP_alpha


Translocon-associated protein (TRAP),


0.41 


3.2 




413-434 


705 


GKAP 


Guanylate-kinase-associated protein ( 


2.7e-292


981.2 


~\ 


621-979 


705 


PLRV_ORF5


Potato leaf roll virus readthrough pr 


0.13 


4.1 


1 


752-766 


705 


DUF887 


Eukaryotic protein of unknown functio 


1


2.6 


1 


797-815 


705 


CYTH 


CYTH domain 


0.26 


6.4 


1 


816-858 


705 


SeqA 


SeqA protein 


0.38


3.6 


1 


824-837 


706 


LBP_BPI_CETP


LBP / BPI / CETP family, N-terminal d 


4.5e-36 


123.6 




33-191 


706 


ABG_transport


AbgT putative transporter family 


0.27


2.7 


-\ 


196-205 


706 


LBP_BPI_CETP_C


LBP / BPI / CETP family, C-terminal d 


8.3e-14


49.9 


1 


253-456 


706 


HS2ST 


Heparan sulfate 2-O-sulfotransferase 


0.21 


4.8 


1 


309-338 


707 


Phage_integr_N 


Phage integrase, N-terminal SAM-like 


0.36 


5.2 


1 


103-121 


707 


Glyco_transf_8


Glycosyl transferase family 8 


0.00044 


15.9 


1


268-340 


708 


LIM 


LIM domain 


9.7e-16 


57.8 


1 


13-69 


708 


zf-HIT 


HIT zinc finger 


0.57 


6.9 


1 


55-65 


709 


DUF572 


Family of unknown function (DUF572)


1.9e-204


689.4 


1 


1-376 


710 


Collagen 


Collagen triple helix repeat (20 copi 


1.6e-14 


56.8 


1 


67-126 


710 


Collagen 


Collagen triple helix repeat (20 copi 


3.6e-08


32.9 


2 


127-174 


710 


Collagen 


Collagen triple helix repeat (20 copi 


4.1e-07 


29.0 


3 


183-232 


710 


Collagen 


Collagen triple helix repeat (20 copi 


0.25 


7.3 


4


237-254 


710 


Collagen 


Collagen triple helix repeat (20 copi 


4.4e-11


43.9 


6 


293-346 


710 


Collagen 


Collagen triple helix repeat (20 copi 


6.4e-07 


28.2 


7 


359-389 


710 


Collagen


Collagen triple helix repeat (20 copi 


0.42 


6.4 


8 


400-418 


710 


Collagen 


Collagen triple helix repeat (20 copi 


0.00074 


16.8 


9 


423-448 


710 


Collagen 


Collagen triple helix repeat (20 copi 


8.6e-08 


31.5 


10 


451-483 


710 


Collagen 


Collagen triple helix repeat (20 copi 


1.1e-11


46.2 


11 


493-550 


710 


Collagen 


Collagen triple helix repeat (20 copi 


6.8e-06 


24.4 


12 


556-593 


710 


Collagen 


Collagen triple helix repeat (20 copi 


0.0014 


15.7 


13 


595-622 


710 


Collagen 


Collagen triple helix repeat (20 copi 


1.8e-06 


26.6 


14 


624-659 


710 


Collagen 


Collagen triple helix repeat (20 copi 


4.1e-12 


47.8 


15 


684-743 


710 


Collagen 


Collagen triple helix repeat (20 copi 


2.4e-05 


22.3 


16 


744-774 


710 


Collagen 


Collagen triple helix repeat (20 copi


2e-11


45.2 


17 


781-829 


710 


Collagen 


Collagen triple helix repeat (20 copi 


0.00026 


18.5 


18 


830-859 


710 


Collagen 


Collagen triple helix repeat (20 copi 


8.1e-15 


57.9 


19 


860-919 


710 


Collagen 


Collagen triple helix repeat (20 copi 


2e-12 


48.9 


20 


920-979 


710 


Collagen 


Collagen triple helix repeat (20 copi 


3.5e-06 


25.5 


21 


1000- 
1031 


710 


Collagen 


Collagen triple helix repeat (20 copi 


1.9e-11


45.2 


22 


1033- 
1090 


710 


Collagen 


Collagen triple helix repeat (20 copi 


6.6e-11


43.2 


23 


1099- 
1154 


710 


Collagen 


Collagen triple helix repeat (20 copi 


3.9e-13 


51.6 


24 


1155- 
1214 


710 


Collagen 


Collagen triple helix repeat (20 copi 


0.0069 


13.1 


25 


1217- 
1234 


710 


Herpes_LP


Herpesvirus leader protein 


0.94 


2.5 


1 


1228- 
1243 


710 


Collagen 


Collagen triple helix repeat (20 copi 


0.0001 


20.0 


26 


1238- 
1269 


710 


Collagen 


Collagen triple helix repeat (20 copi 


4e-09 


36.5 


27 


1278- 
1337 
710


Collagen 


Collagen triple helix repeat (20 copi 


1.9e-13 


52.8 


28 


1341- 
1394 


710 


Collagen 


Collagen triple helix repeat (20 copi 


7.1e-06 


24.3 


29 


1401- 
1434 


710 


Collagen 


Collagen triple helix repeat (20 copi 


0.0012 


16.0 


30 


1435- 
1483 


710 


C4 


C-terminal tandem repeated domain in 


2e-69 


240.8 


1 


1489-
1596 


710 


C4 


C-terminal tandem repeated domain in 


1.3e-77 


268.0 


2 


1597- 
1711 


711 


MGAT2 


N-acetylglucosaminyltransferase II (M 


0.36 


0.6 


1 


61-69 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


7.7e-15 


51.1 


1 


67-108 


711 


ldl_recept_a


Low-density lipoprotein receptor doma 


4e-10 


35.6 


2 


112-152 


711 


DUF351 


Domain of Unknown Function 
(DUF351) 


0.25 


4.8 


1 


136-144 


711 


EGF 


EGF-like domain 


0.00011


19.6 


1 


157-190 


711 


EGF 


EGF-like domain 


0.0004 


17.6 


2 


196-230 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


7.3e-10 


34.9 


1 


332-373 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


2.7e-07 


26.4 


2 


375-417 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


7.6e-08 


28.2 


3 


419-461 


711 


EGF 


EGF-like domain 


0.045 


10.2 


3 


512-553 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


8.3e-10 


34.7 


4 


605-646 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


8.4e-11


38.1 


5 


648-692 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


1.8e-09 


33.6 


6 


694-742 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


0.00039 


15.9 


7


744-781 


711 


EGF 


EGF-like domain 


0.00036 


17.8 


4 


835-870 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


6.6e-17 


57.9 


3 


882-920 


711 


squash 


Squash family serine protease inhibit 


0.6 


2.5 


1 


892-908 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


5.8e-15


51.5 


4 


921-961 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1.6e-15 


53.3 


5 


962- 
1001 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


2.1e-18 


62.8 


6 


1002- 
1041 


711 


DX 


DX module 


0.78 


3.2 


1


1016- 
1047 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


8.9e-16 


54.2 


7 


1043- 
1081


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


2.3e-14 


49.5 


8 


1088- 
1127


711 


ldl_recept_a


Low-density lipoprotein receptor doma 


6.8e-11


38.1 


9 


1130- 
1170 


711 


ldl_recept_a


Low-density lipoprotein receptor doma 


1.6e-06 


23.6 


10 


1173- 
1206 


711 


EGF 


EGF-like domain 


2.1e-07 


29.5 


7 


1213- 
1249 


711 


CBM_14 


Chitin binding Peritrophin-A domain 


0.1 


6.7 


1 


1235- 
1255 


711 


EGF 


EGF-like domain 


0.099 


9.0 


8 


1255- 
1289 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


4.6e-09 


32.3 


9 


1337- 
1382 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


6.3e-15


51.8 


10 


1384- 
1425 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


3.3e-11


39.4 


11 


1427- 
1472 
711


ldl_recept_b 


Low-density lipoprotein receptor repe 


0.094 


8.0 


12 


1474- 
1515 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


0.0042 


12.5 


13 


1517- 
1558 


711 


EGF 


EGF-like domain 


0.00016 


19.0 


9 


1568- 
1606 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


1.6e-12 


43.8 


14 


1655- 
1696 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


0.003 


13.0 


15 


1698- 
1740 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


1.7e-07 


27.1 


16 


1742- 
1780 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


0.003 


13.0 


17 


1782- 
1822 


711 


EGF 


EGF-like domain 


2.2e-06 


25.8 


10 


1875- 
1911 


711 


Keratin 


Keratin 


0.43 


1.6 


1 


1881- 
1894 


711 


DUF244 


Uncharacterized protein family (ORF7) 


0.77 


1.6 


1 


1934- 
1952 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


7.6e-08 


28.2 


18 


1959- 
2000 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


2.7e-13 


46.3 


19 


2002- 
2043 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


3.1e-11


39.5 


20 


2045- 
2087 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


0.00065 


15.2 


21 


2089- 
2118 


711 


EGF 


EGF-like domain 


8.7e-06 


23.6 


11 


2184- 
2219 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


0.49 


5.6 


22 


2318- 
2365 


711 


malic_N 


Malic enzyme, NAD binding domain 


0.26 


2.4 


1 


2340- 
2362 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


7.6e-14 


48.1 


23 


2367- 
2410 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


0.0026 


13.2 


24 


2412- 
2440 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


0.00025 


16.6 


25 


2453- 
2479 


711 


EGF 


EGF-like domain 


0.67 


6.0 


12 


2505- 
2528 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1.5e-14 


50.2 


11 


2545- 
2586 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


6.2e-13 


44.8 


12 


2587- 
2625 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1.4e-14 


50.2 


13 


2626- 
2664 


711 


ldl_recept_a


Low-density lipoprotein receptor doma 


9.4e-11


37.6 


14 


2682- 
2713 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


7.3e-10 


34.7 


15 


2717- 
2753 


711 


ldl_recept_a


Low-density lipoprotein receptor doma 


5.2e-11


38.5 


16 


2755- 
2795 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1.8e-17 


59.8 


17 


2796- 
2838


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


5.8e-14 


48.2 


18 


2840- 
2879 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


5.1e-11


38.5 


19 


2880- 
2923 


711 


ldl_recept_a


Low-density lipoprotein receptor doma 


5.1e-12 


41.8 


20 


2926- 
2964 


711 


EGF 


EGF-like domain


0.61 


6.1


13 


2928- 
2962 


711 


dickkopf_N 


7/11 2849 2856.. 47 54 


0.32 


4.9 


8 


2935- 
2942 


711 


Omega-atracotox 


Omega-atracotoxin 


0.46 


3.7 


2 


2937- 
2957 


711 


EGF 


EGF-like domain


3.9e-06 


24.9 


14 


2967- 
3003 


711 


TIL 


Trypsin Inhibitor like cysteine rich 


6.4e-05 


16.4 


2 


2987- 
3009 


711 


EGF 


EGF-like domain 


0.00094 


16.3 


15 


3009- 
3034 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


8.1e-09 


31.5 


26 


3092- 
3134 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


4.1e-07 


25.8 


27 


3136- 
3177 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


1.1e-08


31.0 


28 


3179- 
3221 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


0.078 


8.3 


29 


3223- 
3251 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


0.0013 


14.2 


30 


3262- 
3289 


711 


EGF 


EGF-like domain 


1.6e-06 


26.3


16 


3314- 
3350 


711 


TNFR_c6 


1/3 69 84.. 1 18 


0.42 


6.2 


2 


3337- 
3352 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1.9e-12


43.2 


21 


3352- 
3391 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1.4e-12 


43.7 


22 


3392- 
3430 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


3.9e-12 


42.2 


23 


3431- 
3470 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


3.5e-17 


58.8 


24 


3471- 
3510 


711 


SAPA 


Saposin A-type domain 


0.039 


6.0 


1 


3479- 
3492 


711 


Sar8_2


Sar8.2 family 


0.12 


6.9 


1 


3480- 
3500 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1.4e-19 


66.7 


25 


3511- 
3549


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


8.3e-13


44.4 


26 


3550- 
3588 


711 


EGF 


EGF-like domain 


0.54 


6.3 


17 


3552- 
3586 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1.3e-14 


50.4 


27 


3590- 
3626 


711 


dickkopf_N 


7/11 2849 2856.. 47 54 


0.057 


7.2 


10 


3596- 
3604 
711


ldl_recept_a 


Low-density lipoprotein receptor doma 


6e-14 


48.1 


28 


3629- 
3666 


711 


Herpes_PAP 


Herpesvirus polymerase accessory prot 


0.41 


2.1 


1 


3637- 
3650 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


2e-19 


66.2 


29 


3667- 
3706 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


2.7e-11


39.4 


30 


3709- 
3749 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


5.7e-07 


25.1 


31 


3758- 
3790 


711 


ldl_recept_a 


Low-density lipoprotein receptor doma 


5.3e-17 


58.2 


32 


3797- 
3835 


711 


EGF 


19/28 3669 3704 1 46 


0.00016 


19.1 


20 


3842- 
3879 


711 


S_locus_glycop


S-locus glycoprotein family 


0.94 


5.0 


1 


3849- 
3870 


711 


TIL 


Trypsin Inhibitor like cysteine rich 


0.051 


7.3 


3 


3864- 
3885 


711 


laminin_EGF


Laminin EGF-like (Domains III and V) 


0.23 


6.5 


2 


3865- 
3879 


711 


EGF 


19/28 3669 3704 1 46 


0.0054 


13.5 


21 


3885- 
3914 


711 


Activin_recp 


Activin types I and II receptor domai 


0.48 


3.4 


1 


3891- 
3921 


711 


NHL 


2/5 681 694.. 1 14 


0.06 


10.7 


4 


4005- 
4031 


711 


ldl_recept_b


Low-density lipoprotein receptor repe 


0.14 


7.5 


31 


4008- 
4016 


711 


ldl_recept_b 


Low-density lipoprotein receptor repe 


0.11 


7.8 


32 


4018- 
4026 


711 


ldl_recept_b 


33/35 4040 4074 9 47 


1.9e-10 


36.9 


34 


4076- 
4118 


711 


ldl_recept_b 


33/35 4040 4074 .. 9 47 


0.019 


10.3 


35 


4120- 
4163 


711 


EGF 


19/28 3669 3704 1 46 


0.77 


5.8 


22 


4213- 
4236 


711 


EB 


EB module 


0.15 


6.2 


3 


4229- 
4244 


711 


EGF 


19/28 3669 3704 .. 1 46 


0.00038 


17.7 


23 


4254- 
4285 


711 


EGF 


19/28 3669 3704 .. 1 46 


8.8e-08 


30.8 


24 


4290- 
4321 


711 


EGF 


19/28 3669 3704 1 46 


7.4e-08 


31.1 


25 


4326- 
4357 


711 


EGF 


27/28 4398 4428 .. 1 46 


0.0014 


15.6 


28 


4431- 
4463 


711 


Coagulin 


Coagulin 


0.52 


3.4 


1 


4447- 
4454 


711 


Herpes_glycop_D


Herpesvirus glycoprotein D 


0.39 


4.3 


2 


4483- 
4519 


712 


MGAT2 


N-acetylglucosaminyltransferase II (M 


0.36 


0.6 


1 


61-69 


712 


ldl_recept_a 


Low-density lipoprotein receptor doma 


1e-14


50.7 


1 


67-108 


712 


ldl_recept_a


Low-density lipoprotein receptor doma 


4e-10 


35.6 


2 


112-152 


712 


DUF351 


Domain of Unknown Function 
(DUF351) 


0.25 


4.8 


1 


136-144 



WO 2004/080148 



PCT/US2003/030720 



496 
TABLE 4B 



SEQ 
ID 


Model 


Description 


E_value 


Score 


Repeats 


Position 


712 


EGF 


EGF-like domain 


0.072 


9.5 


2 


1 cn i o i 
lb /-lol 


714 


cadherin 


Cadherin domain 


0.085 


8.0 


1 


An <c. 


714 


cadherin 


Cadherin domain 


0.00072 


15.2 


2 


oy-izo 


714 


cadherin 


Cadherin domain 


8.4e-17 


60.3 


3 


140-241


714 


cadherin 


Cadherin domain 


1.4e-29


104.9 


4 


ore 1AA 

25j-j44 


714 


cadherin 


Cadherin domain 


7.9e-25 


88.3 


5 


363-466 


714 


cadherin 


Cadherin domain 


2e-26 


93.9 


6 


480-573 


714 


cadherin 


Cadherin domain 


3.2e-28 


100.1 


7 


588-680 


714


Rad21_Rec8


Conserved region of Rad21 / Rec8 like 


0.83 


5.2 


1 


652-662 


714 


cadherin 


Cadherin domain 


3.9e-28 


99.8 


8 


694-784 


714 


SCPU 


Spore Coat Protein U domain 


0.47 


5.3 


1 


701-714 


714 


cadherin 


Cadherin domain 


5.7e-20 


71.3 


9 


798-884 


714 


cadherin 


Cadherin domain 


7.6e-20 


70.9 


10 


898-987 


714 


cadherin 


Cadherin domain 


9.5e-28 


98.5 


11 


1001- 
1091 


714 


cadherin 


Cadherin domain 


5.1e-16


57.6 


12 


1105- 
1201 


714 


cadherin 


Cadherin domain 


1.4e-28 


101.4 


13 


1215- 
1306 


714 


Propep_M14 


Carboxypeptidase activation peptide 


0.41 


5.5 


2 


1228- 
1239 


714 


cadherin 


Cadherin domain 


2.2e-29 


104.2 


14 


1320- 
1411 


714 


cadherin 


Cadherin domain 


7.2e-21 


74.5 


15 


1425-
1520


714 


Baculo_helicase


Baculovirus DNA helicase 


0.61 


1.4 


1 


1521-
1531


714 


cadherin 


Cadherin domain 


4.5e-16 


57.7 


16 


1541-
1622


714 


cadherin 


Cadherin domain 


0.00017 


17.4 


17 


1634-
1700


715 


cadherin 


Cadherin domain 


0.085 


5.0 


1


4 /-Oj 


715 


cadherin 


Cadherin domain 


0.00072 


15.2


2


Oi/- 1ZO 


715 


cadherin 


Cadherin domain 


8.4e-17


60.3 


3


140-241


715 


cadherin 


Cadherin domain 


1.4e-29


104.9


4


9« '1AA 


715 


cadherin 


Cadherin domain 


6.1e-25 


88.7 


5 


363-466 


715 


cadherin 


Cadherin domain 


2e-26 


93.9 


6


480-573


715 


cadherin 


Cadherin domain 


3.2e-28 


100.1 


7 


588-680 


715 


Rad21_Rec8


Conserved region of Rad21 / Rec8 like 


0.83 


5.2 


1


652-662


715 


cadherin 


Cadherin domain 


3.9e-28 


99.8 


8 


694-784 


715 


SCPU 


Spore Coat Protein U domain 


0.47 


5.3 


1 


701-714 


715 


cadherin 


Cadherin domain 


5.7e-20 


71.3 


9 


798-884 


715 


cadherin 


Cadherin domain 


7.6e-20 


70.9 


10 


898-987 


715 


cadherin 


Cadherin domain 


9.5e-28 


98.5 


11 


1001- 
1091 


715 


cadherin 


Cadherin domain 


5.1e-16


57.6


12


1105-
1201


715 


cadherin 


Cadherin domain 


1.4e-28 


101.4 


13 


1215- 
1306 


715 


Propep_M14 


Carboxypeptidase activation peptide 


0.41 


5.5 


2 


1228- 
1239 


715 


cadherin 


Cadherin domain 


2.2e-29 


104.2 


14 


1320- 
1411 


715 


cadherin 


Cadherin domain 


7.2e-21 


74.5 


15 


1425- 
1520


715 


Baculo_helicase 


Baculovirus DNA helicase 


0.61 


1.4 


1 


1521-
1531


715 


cadherin 


Cadherin domain 


4.5e-16 


57.7 


16 


1541- 
1622 


715 


cadherin 


Cadherin domain 


3.5e-05 


19.8 


17 


1634- 
1728 


716 


DPPIV_N_term


Dipeptidyl peptidase IV (DPP IV) N-te 


0.5 


1.1 


1 


310-346 


716 


DPPIV_N_term


Dipeptidyl peptidase IV (DPP IV) N-te 


0.0014 


8.6 




516-589 


716 


DPPIV_N_term


Dipeptidyl peptidase IV (DPP IV) N-te 


5.3e-08 


21.7 




618-652 


716 


Peptidase_S9 


Prolyl oligopeptidase family 


3.9e-11


36.8 


1 


664-736 


716 


Methyltransf_6 


Demethylmenaquinone 
methyltransferase 


0.54 


3.9 


1 


675-688 


716 


Esterase 


Putative esterase 


0.062 


6.6 




710-753 


717 


zf-C2H2 


Zinc finger, C2H2 type 


0.015 


14.8 


1


32-54 


717 


zf-C2H2 


Zinc finger, C2H2 type 


0.0014 


18.9 




60-82 


717 


Apocytochr_F_C 


Apocytochrome F, C-terminal 


1 


3.2 


1 


103-110 


717 


TFIIS 


Transcription factor S-II (TFIIS) 


0.2 


7.1 




154-164 


717 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-08 


37.3 


3 


154-176 


717 


XPA_N


XPA protein N-terminal 


0.3 


6.5 


2 


179-191 


717 


zf-C2H2 


Zinc finger, C2H2 type 


8.5e-06 


27.9 


4 


182-204 


717 


zf-C2H2 


Zinc finger, C2H2 type 


6.4e-08 


36.5 


5 


210-232 


717 


TFIIS 


3/8 210 220.. 29 39 


1 


4.7 


4 


238-248 


717 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-06 


30.8 


6 


238-260 


717 


XPA_N


XPA protein N-terminal 


1 


4.6 


4 


263-275 


717 


zf-C2H2 


Zinc finger, C2H2 type 


1.4e-05 


27.0 


7 


266-288 


717 


zf-C2H2 


Zinc finger, C2H2 type 


2.6e-05 


25.9 


8 


294-316 


717 


TFIIS 


5/8 266 276.. 29 39 


0.2 


7.1 


7 


322-332 


717 


zf-C2H2 


Zinc finger, C2H2 type 


6.9e-06 


28.3


9 


322-344 


717 


XPA_N


XPA protein N-terminal 


0.38 


6.2 


6 


347-359 


717 


TFIIS 


5/8 266 276.. 29 39 


0.14 


7.7 


8


350-360 


717 


zf-C2H2 


Zinc finger, C2H2 type 


1e-07


35.7 


10 


350-372 


719 


Phytoreo_Pns 


Phytoreovirus nonstructural protein P 


0.75 


2.1 


1 


74-88 


719 


malic 


Malic enzyme, N-terminal domain 


0.39 


3.5 


1 


117-131 


719 


AlpA 


Prophage CP4-57 regulatory protein (A 


0.95 


4.3 


1 


258-266 


719 


DUF298 


Domain of unknown function 
(DUF298) 


0.42 


5.1 


1 


308-337 


719 


DUF827 


Plant protein of unknown function (DU 


0.029 


7.3 


1 


363-387 


719 


DUF496 


Protein of unknown function (DUF496) 


0.49 


5.1 


1 


389-409 


719 


K-box 


K-box region 


0.37 


5.2 


1 


392-406 


719 


TFIIE_alpha


TFIIE alpha subunit 


0.14 


5.9 


1 


394-416 


719 


Mlp 


Mlp lipoprotein family 


0.95 


2.4 


1 


398-451 


719 


Ribosomal_S20p


Ribosomal protein S20 


0.38 


5.2 


1 


433-447 


719 


Phage_B 


Scaffold protein B 


0.47 


1.7 


1 


504-518 


720 


ig 


Immunoglobulin domain 


0.07 


9.9 


1 


17-34 


720 


ig 


Immunoglobulin domain 


5.1e-11


44.1 


2 


co no 
00- I/O 


720 


ig 


Immunoglobulin domain 


1.1e-11


46.7 


3 


163-223 


720 


ig 


Immunoglobulin domain 


9.6e-07 


28.1 


4 


259-317 


720 


AstA 


Arginine N-succinyltransferase beta s 


0.92 


2.5 


1 


294-305 


720 


ig 


Immunoglobulin domain 


2.1e-09 


38.1 


5 


352-410 


720 


ig


Immunoglobulin domain 


1.5e-10


42.3 


6 


445-503 


720 


RTC 


RNA 3'-terminal phosphate cyclase


0.7 


3.3 


1 


474-491 


720 


ig 


Immunoglobulin domain 


8.1e-08 


32.1 


7 


538-596 


720 


ig 


Immunoglobulin domain 


1.3e-07 


31.3 


8 


629-687 
720




Immunoglobulin domain 


8.1e-09


35.9 


9 


720-780 


720 




Immunoglobulin domain 


3.7e-09 


37.2 


10 


813-871 


720 


ig 


Immunoglobulin domain 


7.4e-09 


36.0 


11 


904-962 


720 


ig 


Immunoglobulin domain 


1.9e-11


45.7 


12 


995- 
1052 


720 


ig 


Immunoglobulin domain 


1.3e-07 


31.4 


13 


1085- 
1143 


720 


ig 


Immunoglobulin domain 


1.3e-11


46.4 


14 


1176- 
1232 


720 


ig 


Immunoglobulin domain 


3.6e-10 


40.9 


15 


1266- 
1323 


720 


Marek_A 


Marek's disease glycoprotein A


0.84 


1.1 


1


1333- 
1356 


720 


RNA_pol_Rpb2_1


RNA polymerase beta subunit 


0.35 


1.6 


1 


1352- 
1864 


720 


ig 


Immunoglobulin domain 


6.4e-10 


40.0 


16 


1356- 
1413 


720 


tsp_1


Thrombospondin type 1 domain 


1.2e-19 


67.2 


1 


1435- 
1485 


720 


tsp_1


Thrombospondin type 1 domain 


6.4e-17 


58.1 


2 


1492- 
1542 


720 


tsp_1


Thrombospondin type 1 domain 


3.5e-15 


52.3 


3 


1549- 
1599 


720 


tsp_1


Thrombospondin type 1 domain 


2.2e-17 


59.7 


4 


1606- 
1656 


720 


tsp_1


Thrombospondin type 1 domain 


8.2e-12 


41.1 


5 


1663- 
1713 


720 


VOMI 


Vitelline membrane outer layer protei 


0.37 


3.6 


1 


1714- 
1728 


720 


tsp_1


Thrombospondin type 1 domain 


7e-16 


54.7 


6 


1720- 
1770 


720 


EGF 


EGF-like domain 


0.95 


5.4 


1


1993- 
2007 


720 


EGF 


EGF-like domain 


9.3e-08 


30.7 


2 


2013- 
2047 


720 


granulin 


Granulin 


0.44 


4.7 


1 


2034- 
2049 


720 


EGF 


EGF-like domain 


0.015 


11.9 


3 


2053- 
2092 


720 


EGF 


EGF-like domain 


2.8e-05 


21.8


4 


2098- 
2130


720 


TIL 


1/7 1698 1715.. 1 16 


0.0012 


12.5 


3 


2117- 
2136 


720 


EGF 


EGF-like domain 


0.17 


8.2 


5 


2136- 
2157 


720 


EGF 


EGF-like domain 


2.4e-06 


25.7 


6 


2178- 
2215 


720 


EGF 


EGF-like domain 


5.7e-10 


38.7 


7 


2221- 
2256 


720 


Ribosomal_L34 


Ribosomal protein L34 


0.33 


5.5 


1 


2280- 
2323 


720 


TIL 


4/7 2168 2178.. 57 68 


0.022 


8.5 


6 


2320- 
2338 


720 


EGF 


EGF-like domain 


1.9e-09


36.8 


8 


2338- 
2372 
720


toxin_2


Scorpion short toxin 


0.84 


3.4 


2 


2338- 
2353 


720 


toxin_5 


Scorpion short toxin 


0.73 


3.4 


1 


2354- 
2359 


720 


TIL 


4/7 2168 2178.. 57 68 


0.023 


8.4 


7 


2357- 
2378 


720 


squash 


Squash family serine protease inhibit 


0.44 


2.8 


2 


2358- 
2386 


720 


fn2 


Fibronectin type II domain 


0.8 


3.1 


1 


2407- 
2418 


721 


SAP 


SAP domain 


2.4e-10 


40.8 


1 


3-37 


721 


SPRY 


SPRY domain 


1.8e-30


107.5 


1 


289-418 


721 


SRP54 


SRP54-type protein, GTPase domain 


0.0091 


11.6


1 


451-466 


721 


NACHT 


NACHT domain 


0.18 


5.5 


1 


453-469 


721 


SKI 


Shikimate kinase 


0.33 


4.9 


1 


453-466 


721 


Zot 


Zonular occludens toxin (Zot) 


0.22 


5.5 


1 


453-466 


721 


AAA 


ATPase family associated with various 


0.098 


5.8 


1 


454-466 


721 


tRNA_synt_1c_R2


Glutaminyl-tRNA synthetase, non-speci


0.79 


3.9 


1 


580-616 


722 


CheB_methylest 


CheB methylesterase 


1


2.7 


1 


74-92 


722 


DUF258 


Protein of unknown function, DUF258 


0.0014 


13.8 


1 


509-532 


722 


ABC_tran


ABC transporter 


7.4e-59 


198.4 


1 


510-692 


722 


NACHT 


NACHT domain 


0.2 


5.3 


1 


511-527 


722 


SMC_N


RecF/RecN/SMC N terminal domain 


0.47 


3.9 


1 


511-524 


722 


Zot 


Zonular occludens toxin (Zot) 


0.28 


5.1 


1 


511-524 


722 


RHD3 


Root hair defective 3 GTP-binding pro 


0.67 


1.2 


1 


516-530 


722 


Pox_D2


Pox virus D2 protein 


0.86 


1.4 


1 


604-617 


722 


tail_comp_S


Phage virion morphogenesis family 


0.061 


7.3 


1 


606-619 


722 


DUF333 


Domain of unknown function 
(DUF333) 


0.3 


5.7 


1 


818-846 


722 


ABC_tran


ABC transporter 


1.1e-47


160.9 


2 


1322- 
1506 


722 


SufE 


Fe-S metabolism associated domain 


0.28


6.2 


1 


1544- 
1563 


723 


BEX 


Brain expressed X-linked like family 


0.88


2.2 


1 


133-160 


723 


CytoC_RC


Photosynthetic reaction centre cytoch 


1 


1.4 


1 


215-231 


723 


Ski_Sno 


SKI/SNO/DAC family 


0.51 


4.5 


1 


656-672 


724 


HpaB 


4-hydroxyphenylacetate 3-hydroxylase 


0.97 


2.5 


1 


4-14 


724 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase, C-terminal 
do 


6.7e-50 


175.9 


1 


50-201 


725 


C_tripleX 


Cysteine rich repeat 


2e-05 


17.8 


1 


59-76 


725 


Bowman-Birk_leg


Bowman-Birk serine protease inhibitor 


1 


4.0 


1 


68-83 


725 


laminin_EGF


Laminin EGF-like (Domains III and V)


0.32 


6.1 


1 


80-93 


725 


EGF 


EGF-like domain 


8.7e-06 


23.6 


2 


98-126 


725 


TIL 


Trypsin Inhibitor like cysteine rich 


0.0035 


11.0


1 


117-138 


725 


EGF 


EGF-like domain 


7.5e-05 


20.2 


3 


138-172 


725 


TIL 


Trypsin Inhibitor like cysteine rich 


0.26 


5.1 


2 


151-178 


725 


toxin_5


Scorpion short toxin 


0.34 


4.4 


1 


153-158 


725 


EGF 


EGF-like domain 


4.4e-05 


21.1


4 


178-211 


725 


EGF 


EGF-like domain 


9.7e-09 


34.3 


5 


223-258 


725 


MAM 


MAM domain 


3.5e-41


147.0 


1 


402-546 


726 


DUF626 


Protein of unknown function (DUF626) 


0.22 


5.8 


1 


30-64 


726 


VSP 


Giardia variant-specific surface prot 


1


1.8 


1 


106-131 
726


zf-B_box


B-box zinc finger 


c a« ao 

5.9e-Uo 


32.7


1 


iA/r no 


726 


Prefoldin 


Prefoldin subunit 


0.42 


5.7 


1 


222-248


726 


Filamin 


Filamin/ABP280 repeat 


2.3e-21 


74.7 


1 


313-402


726 


NHL 


NHL repeat 


4.4e-10


40.3


1 


431-458


726 


Glyoxalase 


Glyoxalase/Bleomycin resistance prote 


0.78 


3.6




476-504


726 


NHL 


NHL repeat 


2.4e-10 


41.2 


2


478-505 


726 


NHL 


NHL repeat 


l.le-10 




3


525-552


726 


NHL 


NHL repeat 


2.5e-09 


37.6


4


572-599


726 


NHL 


NHL repeat 


7.8e-ll 


43.0 


5


619-646


726 


NHL 


NHL repeat 


3.8e-08 


33.2


6


666-693


727 


PaREP1


Archaeal PaREP1 protein


0.38 


5.3 




111-127 


727 


FCH 


Fes/CIP4 homology domain 


0.026 


10.3 


1 


281-321 


727 


DAG PE-bind 


Phorbol esters/diacylglycerol binding 


2.8e-05 


21.7 


1 


709-747 


727 


RhoGAP 


RhoGAP domain 


3.9e-68 


231.7 


1 


775-947 


727 


Terpene_synth_C 


Terpene synthase family, metal bindin 


0.84 


2.7 


1 


778-812


727 


NnrS 


NnrS protein 


1 


1.8 


1 


934-943 


728 


DUF727 


Protein of unknown function (DUF727) 


0.83 


4.2 


1 


115-129 


728 


CN_hydrolase


Carbon-nitrogen hydrolase 


4e-09 


33.8 




120-216


729 


Pep_M12B_propep


Reprolysin family propeptide


3.3e-14 


44.8 


1 


93-223 


729


Reprolysin


Reprolysin (M12B) family zinc metallo


0.00037


16.1


1 


OOM OA/T 


729 


PsaL 


Photosystem I reaction centre subunit 


0.99 


3.2


— 


302-317


729 


Reprolysin


Reprolysin (M12B) family zinc metallo


8.5e-17


62.5






729 


Fragilysin 


Fragilysin metallopeptidase (M10C) en 


0.46 


3.1 


1


412-430 


729 


dickkopf N 


Dickkopf N-terminal cysteine-rich reg 


0.0036 


10.8




CO/f C/TA 

534-5 00 


729 


Stig1


Stigma-specific protein, Stig1


0.11 


4.5 


1


544-558 


729 


EB 


EB module 


0.8 


3.9 




546-558


729 


tsp 1 


Thrombospondin type 1 domain 


7.1e-09 


31.3 


1


570-623 


729 


zf-A20 


A20-like zinc finger


0.39 


8.6 




702-717


729 


ADAM spacer 1 


ADAM-TS Spacer 1 


3.8e-49 


173.5 


n 


734-852 


729 


Herpes_VP19C


Herpesvirus capsid shell protein VP19


0.95 


3.6 




860-871 


729 


tsp_1


2/12 866 875.. 4 13


0.048


8.5


3


985-1002


729 


tsp_1


2/12 866 875.. 4 13


0.067


8.1


4


1037-1089


729 


tsp_1


2/12 866 875.. 4 13


1.2e-05


20.6


5


1092-1115


729 


PTN_MK_N 


PTN/MK heparin-binding protein family


0.44


4.2


1


1165-1184


729 


tsp_1


2/12 866 875.. 4 13


7.6e-07


24.5


6


1165-1190


729 


tsp_1


2/12 866 875.. 4 13


1.4e-06


23.7


7


1228-1276


729 


tsp_1


2/12 866 875.. 4 13


4.6e-07


25.3


8


1313-1364


729 




2/12 866 875 .. 4 13 


0.00029 


15.9 


9 


1372-1420


729 


tsp_1


2/12 866 875.. 4 13


1.7e-07


26.7


10


1426-1479


729 


tsp_1


2/12 866 875.. 4 13


4.7e-05


18.6


11


1485-1506


729 


tsp_1


2/12 866 875.. 4 13


0.00073


14.6


12


1543-1593


730 


Adeno Penton B 


Adenovirus penton base protein 


0.39 


1.6 


1


178-193


731


ig 


Immunoglobulin domain 


0.19 


8.3 


1 


6-99 


731 


DUF390 


Protein of unknown function (DUF390)


0.73 


0.9 


1 


83-95 


731 


ig 


Immunoglobulin domain 


9.7e-05 


20.6 


2 


146-235 


731 


ig 


Immunoglobulin domain 


0.00014 


20.0 


3 


282-373 


732 


ig 


Immunoglobulin domain 


0.0045 


14.4 


1 


42-129 


732 


ig 


Immunoglobulin domain 


0.19 


8.3 


2 


179-272 


732 


DUF390 


Protein of unknown function (DUF390) 


0.73 


0.9 


1


256-268 


732 


ig 


Immunoglobulin domain 


9.7e-05 


20.6 


3 


319-408 


732 


ig 


Immunoglobulin domain 


0.00014 


20.0 


4 


455-546 


733 


ig 


Immunoglobulin domain 


0.0045 


14.4 


1 


42-129 


734 


ig 


Immunoglobulin domain 


0.0018 


15.8 


1 


42-126 


734 


DUF390 


Protein of unknown function (DUF390) 


0.73 


0.9 


1 


110-122 


735 


RhoGEF 


RhoGEF domain 


8.2e-08 


27.0 


1 


165-225 


735 


FA_hydroxylase 


Fatty acid hydroxylase 


0.6 


3.7 


1 


221-233 


735 


RhoGEF 


RhoGEF domain 


7.5e-09 


30.5 


2 


257-329 


736 


HEM4 


Uroporphyrinogen-III synthase HemD 


0.98 


3.1 


1 


549-581 


736 


DUF178 


Uncharacterized ACR, COG1427 


0.11 


6.0 


1 


604-622 


737 


rrm


RNA recognition motif. (a.k.a. RRM, R 


2.5e-07 


28.2 


1 


78-142 


737 


Smg4_UPF3


Smg-4/UPF3 family 


0.042 


8.7 


1 


143-173 


737 


rrm


RNA recognition motif. (a.k.a. RRM, R 


9.7e-16 


58.1 


2 


151-222 


737 


fer4 NifH 


4Fe-4S iron sulfur cluster binding pr 


1 


2.4 


1 


160-176 


737 


rrm 


RNA recognition motif. (a.k.a. RRM, R 


3.6e-06 


24.1 


3 


274-311 


738 


Adeno_E4_34 


Adenovirus early E4 34 kDa protein co 


0.45 


4.4 


1 


5-22 


739 


ribonuc red sm 


Ribonucleotide reductase, small chain 


0.29 


3.7 


1 


244-265 


740 


Sua5_yciO_yrdC 


yrdC domain 


0.99 


3.3 


1


38-53 


740 


F-box 


F-box domain 


0.095 


9.0 


1 


134-175 


740 


DUF469 


Protein with unknown function (DUF469


0.38 


4.7 


1 


354-371 


741 


OmpH 


Outer membrane protein (OmpH-like) 


0.14 


6.9 


1 


81-150 


741 


Herpes_BLRF2 


Herpesvirus BLRF2 protein 


0.12 


7.3 


1 


256-277 


741 


UIM 


Ubiquitin interaction motif 


0.34 


8.8 


1 


293-310 


741 


DUF260 


Protein of unknown function DUF260 


0.26 


4.8 


1 


330-350 


741 


TelA 


Toxic anion resistance protein (TelA) 


0.34 


4.5 


1 


348-368 


741 


Pox A type inc 


1/5 216 235 .. 1 23 


0.6 


6.3 


2 


358-377 


741 


PspA_IM30 


PspA/IM30 family 


0.34 


5.2 


1 


364-399 


741 


M 


1/5 272 292 .. 1 21 


0.46 


8.0 


3 


534-554 


741 


Coprinusmating 


Coprinus cinereus mating-type protein 


0.65 


1.6


1 


698-729 


741 


Ribosomal L29e 


Ribosomal L29e protein family 


0.3 


5.8 


1 


717-755 


741 


Phage_portal_2 


Phage portal protein, lambda family 


0.75 


2.2 


1 


799-816 


741 


Dishevelled 


Dishevelled specific domain 


0.22 


4.9 


1 


903-922 


741 


SlyX 


SlyX 


0.69 


1.3 


1 


945-954 


742 


cadherin 


Cadherin domain 


0.13 


7.4 


1 


30-96 


742 


cadherin 


Cadherin domain 


8.4e-13 


46.4 


2 


147-243 


742 


cadherin 


Cadherin domain 


7.1e-25 


88.5 


3 


257-349 


742 


He_PIG


Putative Ig domain 


0.4 


5.5 


1 


262-279 


742 


cadherin 


Cadherin domain 


0.049 


8.8 


4 


369-399 


742 


cadherin 


Cadherin domain 


2.3e-05 


20.4 


5 


427-460 


742 


cadherin 


Cadherin domain 


5.6e-21 


74.9 


6 


474-563 


742 


cadherin 


Cadherin domain 


1.9e-25 


90.4 


7 


577-666 


742 


cadherin 


Cadherin domain 


4.5e-09 


33.4 


8 


693-737 


743 


PGM_PMM_I


Phosphoglucomutase/phosphomannomutase


1.6e-15 


57.2 


1


1-47 


743 


PGM_PMM 


Phosphoglucomutase/phosphomannomutase


0.041 


9.3 


1 


388-430 
743


Cor1


Cor1/Xlr/Xmr conserved region


0.73


4.1


1 


AK 

4ZJ-4JJ 


744 


MACPF 


MAC/Perforin domain 


0.00017 


15.5


1


1*2Q 17A 


744 


Keratin matx 


Keratin, high-sulphur matrix protein 


0.19 


7.6 


-J 


/K1-4R9 


744 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


0.29


4.1





£09 £79 

ouz-ozz 


745 


Remorin_C 


Remorin, C-terminal region 


0.19 


6.6 


1


120-133


745 


zf-C2H2 


Zinc finger, C2H2 type 


0.00033 


21.5


1


130-152


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.3 


6.5


1


158-168


745 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-07


33.1


2


158-180


745 


XPA N 


XPA protein N-terminal 


0.72 


5.2 


3 


183-195


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.069 


8.7 


2


186-196


745 


zf-C2H2 


Zinc finger, C2H2 type


3.9e-07 


33.3 


3 


186-208


745 


XPA N 


XPA protein N-terminal 


0.21 


7.1 


4 


211-223


745 


zf-C2H2 


Zinc finger, C2H2 type 


9.4e-08 


35.8 


4 


214-236


745 


zf-C2H2 


Zinc finger, C2H2 type


4.8e-07 


32.9 


5 


242-264


745 


XPA N 


XPA protein N-terminal 


0.13 


7.8


6


267-279


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.28 


6.6 


5 


270-280


745 


zf-C2H2 


Zinc finger, C2H2 type


3.1e-07 


33.7 


6


270-292


745 


XPA N 


XPA protein N-terminal 


0.13


7.8


7


295-307


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.54 


5.6


6


298-308


745 


zf-C2H2 


Zinc finger, C2H2 type 


3.1e-06 


29.7


7


298-320


745 


XPA N 


XPA protein N-terminal 


0.74


5.2


8


323-335


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.073 


8.6


7


325-336


745 


zf-C2H2 


Zinc finger, C2H2 type 


3.3e-07 


33.6


8


326-348


745 


XPA N 


XPA protein N-terminal 


0.72 


5.2


9


351-363


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.51 


5.7 


8


354-364




745 


zf-C2H2 


Zinc finger, C2H2 type 


7.5e-07 


32.2 


9


354-376


745 


XPA N 


XPA protein N-terminal 


0.13 


7.8


10


379-391


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.28 


6.6 


9


382-392


745 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-06 


29.1 


10


382-404


745 


XPA N 


XPA protein N-terminal 


0.13 


7.8


11


407-419


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.28 


6.6


10


410-420




745 


zf-C2H2 


Zinc finger, C2H2 type 


2.7e-07 


33.9 


11


410-432




745 


zf-C2H2 


Zinc finger, C2H2 type 


0.0011 


19.4


12


440-460 


745 


XPA N 


XPA protein N-terminal 


0.67 


5.3


1 7 


4RS-407 


745 


zf-C2H2 


13/16 466 481.. 1 17 


3.9e-06 


29.3 


14 


488-510 


745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.0051 


12.6


1 1 

12 


J 1 J-JZO 


745 


zf-C2H2 


13/16 466 481.. 1 17 


1.3e-05 


27.2 


15 


j 10-jjO 


745 


zf-BED 


BED zinc finger 


0.71 


4.6 


i 
3 


D 1 /•JJ7 


745 


XPA N 


XPA protein N-terminal 


0.092 


8.3 


14




745 


TFIIS 


Transcription factor S-II (TFIIS) 


0.28 


6.6 


13


^44-^^4 


745 


zf-C2H2 


13/16 466 481.. 1 17 


0.00057 


20.5 


16


J 44- J OJ 


746 


KRAB 


KRAB box 


6.9e-24 


88.6 


1 


"l^ 7^ 


746 


ROS_MUCR 


ROS/MUCR transcriptional regulator pr


0.33 


3.9


1 


Rl 104 


746 


Remorin C 


Remorin, C-terminal region 


0.19 


6.6 


1


195-208


746 


zf-C2H2 


Zinc finger, C2H2 type


0.00033


21.5 


1 


205-227 


746 


TFIIS 


Transcription factor S-II (TFIIS) 


0.3 


6.5 


1 


233-243 


746 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-07 


33.1 


2 


233-255 


746 


XPA N 


XPA protein N-terminal 


0.72 


5.2 


3 


258-270 


746 


TFIIS 


Transcription factor S-II (TFIIS) 


0.069 


8.7 


2 


261-271 


746 


zf-C2H2 


Zinc finger, C2H2 type 


3.9e-07 


33.3 


3 


261-283 


746 


XPA N 


XPA protein N-terminal 


0.21 


7.1 


4 


286-298 


746 


zf-C2H2 


Zinc finger, C2H2 type 


9.4e-08 


35.8 


4 


289-311 


746 


zf-C2H2 


Zinc finger, C2H2 type 


4.8e-07 


32.9 


5


317-339


746


XPA N 


XPA protein N-terminal 


0.13 


7.8 


6 


342-354


746 


TFIIS 


Transcription factor S-II (TFIIS) 


0.28 


6.6 


5


345-355


746 


zf-C2H2 


Zinc finger, C2H2 type 


3.1e-07 


33.7 


6 


345-367 


746 


XPA N 


XPA protein N-terminal 


0.13 


7.8 


7 


370-382


746 


TFIIS 


Transcription factor S-II (TFIIS) 


0.54 


5.6 


6 


373-383 


746 


zf-C2H2 


Zinc finger, C2H2 type 


3.1e-06 


29.7 


7 


373-395 


746 


XPA N 


XPA protein N-terminal 


0.74 


5.2 


8 


398-410 


746 


TFIIS 


Transcription factor S-II (TFIIS) 


0.073 


8.6 


7 


400-411 


746 


zf-C2H2 


Zinc finger, C2H2 type 


3.3e-07 


33.6 


8 


401-423 


746 


XPA N 


XPA protein N-terminal 


0.72 


5.2 


9 


426-438 


746 


TFIIS 


Transcription factor S-II (TFIIS) 


0.51 


5.7 


8 


429-439


746 


zf-C2H2 


Zinc finger, C2H2 type 


7.5e-07 


32.2 


9 


429-451 


746 


XPA N 


XPA protein N-terminal 


0.13 


7.8 


10 


454-466 


746 


TFIIS 


Transcription factor S-II (TFIIS) 


0.28 


6.6 


9 


457-467 


746 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-06 


29.1 


10 


457-479 


746 


XPA_N 


XPA protein N-terminal 


0.13 


7.8 


11 


482-494 


746 


TFIIS 


Transcription factor S-II (TFIIS) 


0.28 


6.6 


10 


485-495 


746 


zf-C2H2 


Zinc finger, C2H2 type 


2.7e-07 


33.9 


11 


485-507 


746 


zf-C2H2 


Zinc finger, C2H2 type 


0.0011 


19.4 


12 


515-535 


747 


EMP24 GP25L 


emp24/gp25L/p24 family 


4.9e-80 


276.1 


1 


5-201 


748 


acid_phosphat


Histidine acid phosphatase


7.9e-159


537.8 


1 


31-371 


749 


C tripleX 


Cysteine rich repeat 


0.92 


4.2 


1 


52-67 


749 


ApoC-I 


Apolipoprotein C-I (ApoC-1) 


0.83 


3.7 


1


19o-2oU 


749 


PH 


PH domain 


1.5e-20 


69.0 


1 


393-4o7 


749 


ArfGap 


Putative GTPase activating protein fo 


2.1e-60 


210.7 


1 


COT £.AC\ 


749 


ank 


1/4 797 823.. 7 33 


1.5e-08 


33.7 


2 


ozO-oDo 


749 


ank 


1/4 797 823.. 7 33 


0.0001 


20.0 


3 


859-891 


751 


DUF369 


Domain of unknown function (DUF369)


0.17 


5.8 


1 


one noo 

275-2oo 


751 


KRAB 


KRAB box 


1.1e-20


77.0 


1 




751 


zf-C2H2 


Zinc finger, C2H2 type 


7.8e-06 


28.0 


1 


OU3-0ZD 


751 


TFIIS 


Transcription factor S-II (TFIIS) 


0.78 


5.1 


1 


(Z(\A /CIO 

0U4-0 Y j 


751 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-05 


26.8 


2 




751 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-07 


33.4 


3 


693-715 


751 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.54 


2.9 


1 


/Uo-/ZO 


751 


TFIIS 


Transcription factor S-II (TFIIS) 


0.63 


5.4 


3 


*701 111 

f 11-131 


751 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-05 


27.2 


4 


/21-/43 


751 


zf-C2H2 


Zinc finger, C2H2 type 


3.4e-08 


37.4 


5 


751-773 


751 


zf-BED 


BED zinc finger 


0.31 


5.8 


1 


752-774 


751 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.032 


6.3 


2 


766-784 


751 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-06 


28.6 


6 


779-801 


752 


Vps16_N


Vps16, N-terminal region


2.3e-273


918.3 


1 


1-420 


752 


Ribosomal L36 


Ribosomal protein L36 


0.6 


5.0 


1 


245-281 


752 


Fumerase


Fumarate hydratase (Fumerase) 


0.71




1


376-402 


752 


Peptidase_M16_C


Peptidase M16 inactive domain 


0.29 


5.2 


1 


492-510 


752 


Vps16_C


Vps16, C-terminal region


2.4e-15 


57.9 


1 


517-548 


752 


Vps16_C


Vps16, C-terminal region


4.6e-128


435.6 


2 


554-762 


753 


LRRNT 


Leucine rich repeat N-terminal domain 


0.0011 


14.5 


1 


30-59 


753 


XG FTase 


Xyloglucan fucosyltransferase 


0.53 


2.0 


1 


37-48 


753 


LRR 


Leucine Rich Repeat 


0.36 


6.7 


1 


61-82 
753


LRR 


Leucine Rich Repeat 


0.0014 


14.8 


2 


83-106 


753 


LRR 


Leucine Rich Repeat 


5.7e-05


19.5


3 


1 AO 111 


753 


LRR 


Leucine Rich Repeat 


2.7e-05 


20.6


4


1 1 CC 


753 


LRR 


Leucine Rich Repeat 


0.001 


15.3 


5


156-179


753 


LRR 


Leucine Rich Repeat 


0.0036


13.4 


6


180-203 


753 


LRR 


Leucine Rich Repeat 


0.0016 


14.6


7


204-227


753 


LRR 


Leucine Rich Repeat 


0.00015 


18.1 


8 


228-251 


753 


LRRCT 


Leucine rich repeat C-terminal domain 


9.7e-12 


37.1 


1 


261-311 


754 


A2M_N 


Alpha-2-macroglobulin family N-termin


4.5e-91 


312.7 


1 


6-613 


754 


Big 1 


Bacterial Ig-like domain (group 1) 


0.62 


3.9 


1 


382-403 


754 


A2M 


Alpha-2-macroglobulin family 


6.2e-64 


214.2 


1 


721-949


754 


A2M 


Alpha-2-macroglobulin family 


6.2e-132


444.2 


2 


983- 

1 /I /CO 


754 


Pox_D2


Pox virus D2 protein 


0.18 


3.4


1


1448-1461


755 


DUF904 


Protein of unknown function (DUF904) 


0.21


a o 
O./ 


i 


1 1 (L 10^ 


755 


DUF536 


Protein of unknown function, DUF536 


0.47


A A 

0.4 


1 


1 AO 1 QO 


755 


Syntaxin 


Syntaxin 


0.11


O O 

/.y 


1 


1/^1 1Q7 


755 


fibrinogen_C


Fibrinogen beta and gamma chains, C-t


1 o« ao 

l . /e-uy 


10 1 
3Z. 1 


i 
i 


OAO 07S 
Z4Z-Z / J 


755 


fibrinogen_C


Fibrinogen beta and gamma chains, C-t 


i o« o< 

l.ye-zo 


QA O 
60. / 


z 


Z /i7-4ZZ 


756 


ig 


Immunoglobulin domain 


1.3e-Uo 


1*7 A 
Z/.O 


1 


Ai ino 1 

4 J- 1UZ 


756 


ig 


Immunoglobulin domain 


z.ze-UD 


Ol A 


z 


1 17-1 08 


756 


FYRN 


F/Y-rich N-terminus 


A 


<I 1 
J.J 


1 
1 


1 simoon 


756 


ig 


Immunoglobulin domain 


/z ao 
o.De-Uy 


1/£ O 
30.Z 


•1 
J 


OAO-0 QQ 


756 


ig 


Immunoglobulin domain 


z.Je-U3 


oo O 

zz.y 


4 


JJ7-JOO 


756 


ig 


Immunoglobulin domain 


o n a no 

z.ye-Uo 


3o.o 


c 
D 




756 


ig 


Immunoglobulin domain 


/./e-u/ 


Ofl ^ 


O 


514-S79 


756 


fn3


Fibronectin type III domain 


O "la. 01 

/./e-zj 


O 1. 1 


1 
1 




756 


fn3


Fibronectin type III domain 


O 1 a. HQ 

y.ie-Uo 


OC *7 
Zo./ 


O 

z 


700-700 


756 


fn3


Fibronectin type III domain 


O 1a 1 O 


<A A 


1 




756 


fn3 


Fibronectin type III domain 


i.oe-uy 


14 R 
,54.0 


4 
1 


903-986 


757 


LRR 


Leucine Rich Repeat 


0.29


7.0


1


52-75




757 


LRR


Leucine Rich Repeat


0.003


13.7


2


76-99


757 


LRR 


Leucine Rich Repeat 


4e-05


20.0


3


100-123 


757 


LRR 


Leucine Rich Repeat 


0.021


10.8


4


124-147


757 


LRR 


Leucine Rich Repeat 


3e-05


20.4


5


148-171


757 


LRR 


Leucine Rich Repeat 


0.00019


17.8


6


172-195


757 


FliD 


Flagellar hook-associated protein 2 


0.96


1.2


1


194-209


757 


LRR 


Leucine Rich Repeat 


0.16


7.8


7


196-216




757 


LRRCT 


Leucine rich repeat C-terminal domain 


9.3e-10


31.0


1


240-285


757 


ig 


Immunoglobulin domain 


9.4e-09


35.6


1


301-359




757 


fn3 


Fibronectin type III domain 


A AAA/1 <C 


i ^ o 

o.y 


1 
1 




758 


LRR 


Leucine Rich Repeat 


0.29


7.0


1


52-75


758 


LRR 


Leucine Rich Repeat 


0.003


13.7


2


76-99




758


LRR


Leucine Rich Repeat


4e-05 


20.0 


3 


100-123 


758 


LRR 


Leucine Rich Repeat 


0.021 


10.8 


4 


124-147 


758 


LRR 


Leucine Rich Repeat 


3e-05 


20.4 


5 


148-171 


758 


LRR 


Leucine Rich Repeat 


0.00019 


17.8 


6 


172-195 


758 


FliD 


Flagellar hook-associated protein 2 


0.96 


1.2 


1 


194-209 


758 


LRR 


Leucine Rich Repeat 


0.16 


7.8 


7 


196-216 


758 


LRRCT 


Leucine rich repeat C-terminal domain 


9.3e-10 


31.0 


1 


240-285 


758 


ig 


Immunoglobulin domain 


9.4e-09 


35.6 


1 


301-359


758 


fn3 


Fibronectin type III domain 


0.013 


10.8 


1 


466-500 
759


Serendipity_A 


Serendipity locus alpha protein (SRY-


A K. 


1 i 

Z.3 


t 

i 


5/5- 1UO 


759 


EGF 


EGF-like domain 


0.76 


5.8 


1


111-133 


759 


SEA 


SEA domain 


4.9e-06 


on 1 

11. 1 


1 


1 /CQ 717 


759 


ig 


Immunoglobulin domain 


9.8e-07 


28.1 


1 


286-352 


759 


AIG2 


AIG2-like family 


0.81 


2.4 


1 


329-340 


759 


ig 


Immunoglobulin domain 


0.33 


7.4 


2 


485-547 


759 


60KD IMP 


60Kd inner membrane protein 


0.64 


3.1 


1 


502-523 


759 


Atracotoxin


Delta Atracotoxin 


0.31 


6.4 


1 


628-642 


759 


CAS CSE1 


CAS/CSE protein, C-terminus 


0.28 


5.8 


1 


902-915 


759 


GPS 


Latrophilin/CL-1-like GPS domain


2e-14 


54.5 


1 


950-1002


759 


7tm_2 


7 transmembrane receptor (Secretin fa 


6.4e-21 


73.4 


1 


1009-1273


759 


ATP-synt_G


Mitochondrial ATP synthase g subunit 


0.66 


3.9 


1 


1267-1279


759 


SH 


Viral small hydrophobic protein 


0.63 


4.1


1


1273-1292


760 


TFIIS 


Transcription factor S-II (TFIIS) 


a in 
0.2/ 


/C 6 
0.0 


1 


1 07 117 


760 


zf-C2H2 


1/13 93 101 .. 16 24 


U.UU01 3 


11 o 
Z3.Z 


i 
Z 


1H7-1 70 

iu 1-vi.y 


760 


zf-C2H2 


1/13 93 101 .. 16 24 


j.ze-Uo 


zy.o 


i 
3 


1 1 S7 


760 


Ribosomal L19e 


Ribosomal protein L19e 


A CA 

0.59 


3.y 


1 


141-1 61 


760 


TFIIS 


Transcription factor S-II (TFIIS) 


A AQ 


Q A 
0.4 


i 
3 


16^-17^ 


760 


zf-C2H2 


1/13 93 I0l .. 16 24 


1 Go. ac 


76 Q 

zo.y 


A 




760 


XPA N 


XPA protein N-terminal 


A 1*7 

0.3/ 


6 o 
o.z 


1 
3 




760 


TFIIS 


Transcription factor S-II (TFIIS)


A AQ 


Q A 

5.4 


A 
H 




760 


zf-C2H2 


1/13 93 101 .. 16 24 


4.3e-Uo 


zy. i 


c 
D 


1Q1-71^ 


760 


XPA N 


XPA protein N-terminal 


Air 

O.Ij 


/.o 


A 
4 


Z LO~Z^7 


760 


TFIIS 


Transcription factor S-II (TFIIS) 


A 1 1 

0.31 


0.4 




910-990 


760 


zf-C2H2 


1/13 93 101 .. 16 24 


1 Aa. A/C 

z.4e-Uo 


1A 1 

3U. 1 


0 


710-741 


760 


XPA N 


XPA protein N-terminal 


(\ AC 

0.4!) 


< O 

D.y 


J 


Z*r*r-ZJO 


760 


TFIIS 


Transcription factor S-II (TFIIS) 


A A1 1 


11 o 

1 l.Z 


O 


947-9^7 


760 


zf-C2H2 


1/13 93 I0l 16 24 


y.ie-u/ 


11 Q. 
31.0 


7 


947-960 

Z*t / -ZrU-' 


760 


XPA N 


XPA protein N-terminal 


A 7Q 

u.zy 


0.0 


O 


979-984 
z /z-^o*t 


760 


zf-C2H2 


1/13 93 I0l 16 24 


O Qa. AO 


3 /.Z 


o 

0 


97S-907 


760 


zf-BED 


BED zinc finger 


A 1 1 


7 1 
/. 1 


i 
j 


976-90R 


760 


zf-C2H2 


1/13 93 101 .. 16 24 


la f\f^ 

3e-Uo 


zy. / 


o 


^0^-^9S 
jv/j-jzj 


760 


TFIIS 


Transcription factor S-II (TFIIS) 


A A1 O 


1A£ 

LU.O 


A 

y 


71U141 


760 


zf-C2H2 


1/13 93 101 .. 16 24 


1 1 a A< 

3. le-uo 


70 7 

zy. / 


l n 




760 


zf-C2H2 


1/13 93 101 .. 16 24 


1 la. A7 

z. /e-u/ 


33. y 


1 1 
1 1 




760 


zf-BED 


BED zinc finger 


A £1 

U.03 


A Q 
4.0 


A 
4 


JUU JOZ 


760 


PqiA 


Paraquat-inducible protein A 


A ^ 


4.U 


i 
z 


^7R-40Q 


760 


XPA N 


9/11 356 366.. 1 11 


A 11 

O.ZZ 


*7 A 

/.u 


1 A 
1U 


^R4-^06 

J0H-J7O 


760 


zf-C2H2 


1/13 93 101 .. 16 24 


O 5a AQ 

o.oe-Uo 


K O 

3j.y 


17 
IZ 




760 


TFIIS 


Transcription factor S-II (TFIIS) 


A A1/C 

0.U3O 


y. / 


17 
1Z 


41 ^-495 


760 


zf-C2H2 


1/13 93 101 .. 16 24 


A AO O 

O.Uzo 


13. / 


1 1 

13 


41 ^-4^7 


Vol 




v^iq aomain 


0.77 


4.7 


1 


104-116 


761 


DUF127 


Protein of unknown function DUF127 


0.81 


2.3 


1 


134-143 


761 


Hydrolase 


haloacid dehalogenase-like hydrolase 


0.53 


4.3 


1 


176-189 


761 


Hydrolase 


haloacid dehalogenase-like hydrolase 


0.27 


5.3 


2 


443-477 


761 


Hydrolase 


haloacid dehalogenase-like hydrolase 


0.65 


4.0 


3 


543-620 


761 


PgpA 


Phosphatidylglycerophosphatase A 


0.96 


2.9 


1 


745-760 


761 


DUF418 


Protein of unknown function (DUF418)


0.15 


6.0 


1 


833-887 


763 


zf-HIT 


HIT zinc finger 


0.21 


8.5 


1 


161-179 


763 


zf-C2H2 


Zinc finger, C2H2 type 


0.0099 


15.5 


1 


170-193


764


FHA 


FHA domain 


a ao A 
0.024 


11.6 


1 


A A 


764 


HIT 


HIT domain 


da. ac 


1 < o 




1 Q 1 01^ 

lol-Zjj 


764 


DcpS 


Scavenger mRNA decapping enzyme 
(DcpS 


A AAAA 


8.2 


1 




zlu-z / 1 


764 


DUF369 


Domain of unknown function 
(DUF369) 


A IS 
U.JJ 







910 9^0 


764 


zf-C2H2 


Zinc finger, C2H2 type 


u.uzo 


1 Q 







767 


Cwf_Cwc_15 


Cwf15/Cwc15 cell cycle control protei


o.oe- 

lO 1 






1 «979 


767 


DUF692 


Protein of unknown function (DUF692)


A 01 




1 




197-148 


768 


SRCR 


Scavenger receptor cysteine-rich doma 


y.zc-oo 


197 Q 
IZ / .i/ 




^9-199 


768


SRCR 


Scavenger receptor cysteine-rich doma 


6 <>p 1 S 


J*T.Z 


-z 






768


Lysyl oxidase 


Lysyl oxidase 


1 9f» RA 


978 1 
Z /o. L 





9S1-1S9 


769 


RHS 


RHS protein 


A 8^ 


4 8 






769 


GatB 


PET112 family, C terminal region


0.41 


5.8 


1 


64-86 


769 


Glyco_transf_8 


Glycosyl transferase family 8 


1 0» 1 A 


4A 1 





fiS-997 


769 


Phage holin_4 


Holin family 


A 84 


4 A 


— 


969-282 


770 


WD40 


WD domain, G-beta repeat 


A ^ 
U.J 


6 4 




169-194 


770 


WD40 


WD domain, G-beta repeat 


C Op A£ 


9*^ 8 


-r 


225-251 


770 


DUF130 


Domain of unknown tunction uur 


A 074 




-j 


241-255 


770 


WD40 


WD domain, G-beta repeat 




7 A 


~4 


^74-401 


771 


TPR 


TPR Domain 


A 97 

u.z/ 


7 ^ 


-j 


190-214 


773 


CTP_transf_l 


Cytidylyltransferase family 


j.^e- 

195 
IZJ 


49^ 1 


1 


69-400 


773 


DAG PE-bind 


Phorbol esters/diacylglycerol binding


A 98 


7 ^ 




166-180 


773 


Pyridox oxidase 


Pyridoxamine 5'-phosphate oxidase


A ^4 


2.7 




326-334 


773 


KIX


KIX domain 


A 48 






415-435 


774 


CBM_20 


Starch binding domain 


A 078 


8 5 




86-105 


774 


WD40 


WD domain, G-beta repeat 


^ Op-08 


31.2 




165-203 


775 


TACC 


Transforming acidic coiled-coil-conta 


0 4^ 


3.9 




312-334 


775 


bZIP 


1 /I 1 AQ 79< 48 

1/Z jUo jZj .. *to Oj 


0 39 


5.9 




408-438 


776 


Tweety 


Tweety 


'x 4e-74 


256.6 




21-413 


779 


HesB-like 


HesB-like domain 


9 8p-41 


132.5 




49-151 


780 


ig 


Immunoglobulin domain 


0 01 S 

V/.V 1 J 


12.4 




2-57 


780 


ig 


Immunoglobulin domain 


0 000^^ 


18.6 




96-155 


781 


Mpv17_PMP22


Mpv17 / PMP22 family


8p-14 

OC" 1*T 


51 5 




129-191 


781 


Adenovirus PX 


Adenovirus late L2 mu core protein (P 


0 fiS 


5 4 




133-152 


782 


sic 


sic protein 


O 1 

V/. I 


3 9 




184-239 


783 


Collagen 


Collagen triple helix repeat (20 copi 


5.5e-07 


28.5 




13-51 


783 


Collagen 


Collagen triple helix repeat (20 copi 


A A44 


10 1 


9 


59-81 


783 


Collagen 


Collagen triple helix repeat (20 copi 


A A14 


19 0 


\ 
J 


86-104 


783 


Collagen 


Collagen triple helix repeat (20 copi 


A A90 


10 7 


4 


106-127 


783 


Collagen 


Collagen triple helix repeat (20 copi 


A A1 1 


19 1 
IZ. 1 


J 


132-150 


783 


Collagen 


Collagen triple helix repeat (20 copi 


A AA 


1 A T 


o 


152-173 


783 


Collagen 


Collagen triple helix repeat (20 copi 


U.U1 J 


19 1 
1Z. 1 


7 


175-196 


783


Collagen


Collagen triple helix repeat (20 copi


2.5e-07 


29.8 


8 


198-237 


783 


S-AdoMet_syntD3


S-adenosylmethionine synthetase, C-te 


0.29 


4.1 


1 


232-247 


783 


vwa 


von Willebrand factor type A domain 


1.2e-46 


149.1 




266-448 


783 


Kunitz BPTI 


Kunitz/Bovine pancreatic trypsin inhi 


2.2e-23 


71.1 




540-590 


784 


DUF388 


Domain unknown function (DUF388)


0.047 


8.8 




1-18 


784 


Mtap_PNP 


Phosphorylase family 2 


0.26 


5.1 




1-18 


784 


Sterol desat 


Sterol desaturase 


1.8e-48 


164.1 




57-263 


785 


ig 


Immunoglobulin domain 


0.0011 


16.7 




116-176 
785


malic 


Malic enzyme, N-terminal domain 


0.76


2.6 


1 


203-209 


785 


ig 


Immunoglobulin domain 


j.oe-iu 


4u.y 


I 


nil OQ1 


785 


APOBECC 


APOBEC-like C-terminal domain 


0.88 


5.2 


1 


446-474 


785 


enolase 


Enolase, C-terminal TIM barrel domain 


0.2o 


4.4 


1 
1 


OUU-O I 0 


785 


RNA_pol_Rpb1_5


RNA polymerase Rpb1, domain 5


0.65 


1.5 


1 
1 


i ni a 
1034- 

1 AQQ 


785 


ig 


Immunoglobulin domain 


A Oo AC 

4.ye-Uj 


O 1 *7 


1 
J 


1 JOJ- 

1 A1 ^ 
1^-13 


785 


sigma70_r3 


Sigma-70 region 3 


U.O 


J./ 


1 
l 


1401- 
14R1 


785 




— — ; — 

Immunoglobulin domain 


C Oo f\Q 

j,ze-uo 


^9 Q 


A 


1 *5S9- 
16H 


/CO I 




RNA_helicase


RNA helicase


0 000^0 
\j.\j\j\j£ty 


1 ^ 0 


-7 


30-55 


/oo 


AAA


ATPase family associated with various






7 


32-48 


/oo 


NACHT


NACHT domain


n 00^9 


19 0 





34-56 


/oo 


AXD UlnA < 
A 1 r-DinO 


Conserved hypothetical ATP binding pr


0 64 

V/.U*t 


■J. J 


-T 


35-46 


/oO 


NB-ARC


NB-ARC domain




9 7 


-7 


35-50 


/oo 


ADK


Adenylate kinase 




10 0 


7 


67-1 14 


/OO 


ADK


Adenylate kinase 




6 9 




127-160 


7oo 


ZZ


Zinc finger, ZZ type 




8 8 
o.o 


"~7 


146-157 


/oo 


SRP54


SRP54-type protein, GTPase domain




S Q 




390-408 


/oO 


SKI


Shikimate kinase


0 19 


6 4 




392-413 


/oo 


ATD kin/4 


Conserved hypothetical ATP binding pr


V.O J 


^ 1 
j . i 




396-413 


/oo 


RHD3


Root hair defective 3 GTP-binding pro


V.UJ7 






397-411 


/oo 


CoaE


Dephospho-CoA kinase


n 19 


6 4 


-7 


402-421 


/oo 


Thymidylate_kin


Thymidylate kinase


n 8i 


9 1 


1 


402-418 


788 


SH3 


SH3 domain 


2.3e-14 


55.4 


1 


1-56 


789


SH3


SH3 domain


i Sf»_i s 

l.Jc-lJ 






73-129 


790 


TIMP


Tissue inhibitor of metalloproteinase


1 Of- RQ 


94^ 0 

Z*TJ.7 


-j 


20-116 


790


rhytornsyiO 


Phytoreovirus nonstructural protein P 


n aa 


o.u 


-j 


102-108 


791 


DUF716


Family of unknown function (DUF716)


V.7J 


^ 4 


"1 


26-54 


791 


DcuC 


C4-dicarboxylate anaerobic carrier 




4 ^ 


-j 


97-48 

Z> /— TO 


791 


FLO_LFY


Floricaula / Leafy protein 


n 99 


9 7 


-7 




127-140 


791


lectin_c


Lectin C-type domain


1 Qp-07 


"^1 5 




162-267 


792


UDPGT


UDP-glucoronosyl and UDP-glucosyl tra


7 Qp- 

258 


866 7 


-7 


24-Mtl 


792


Pox_E8


Poxvirus E8 protein 


n si 

U.O 1 


^ 1 




56-70 


792


Glyco_tran_28_C


Glycosyltransferase family 28 C-termi


0 06 


7 5 


-: 


292-314 




TRAPP_Bet3


Transport protein particle (TRAPP) co


1 1p-67 


9^S 0 




6-173 


794 


PCMT


Protein-L-isoaspartate(D-aspartate) O


U.wJ J 


1 1 fi 

l l.v 


~~i 


74-113 

/ *t— 1 u 


794


Ubie_methyltran


ubiE/COQ5 methyltransferase family


1 Qp-OS 


18 ^ 

IO. J 


-j 


161-182 


794 


Methyltransf_8


Hypothetical methyltransferase 


n (\d 


7 5 


-j 

— 


168-182 


795 


Brix 


Brix domain 


ze-oo 






1-948 
i-^*to 


795 


PDZ 


PDZ domain (Also known as DHR or GLGF


U.J 


6 ^ 

U.J 


-: 

• 


946-973 


795


7tm_1


7 transmembrane receptor (rhodopsin f


3e-42 


125.3 




444-671 


795 


Lip_A_acyltrans 


Bacterial lipid A biosynthesis acyltr 


0.4 


4.3 




532-558 


795 


ACPS 


4'-phosphopantetheinyl transferase su


0.72 


3.3 




585-600 


796 


ig 


Immunoglobulin domain 


0.0042 


14.5 




33-110 


797 


ig 


Immunoglobulin domain 


0.0042 


14.5 




33-110 


798 


ig 


Immunoglobulin domain 


0.0042 


14.5 




33-110 


798 


FHL 


Flagellar basal body-associated prote 


0.029 


9.2 




170-203 


798 


DcuC 


C4-dicarboxylate anaerobic carrier 


0.044 


7.9 




174-193 


799 


PH 


PH domain 


1.9e-21 


72.0 




14-112


E_value 


Score 


Repeats 


Position 


800


Ifi-6-16


Interferon-induced 6-16 family 


1 1 a A 1 

i. ie-4i 


T A A A 

144.4 




1 A 0*7 

10-87 


oUU 


GLTT


GLTT repeat (6 copies)


a 1 Q 
U. lo 


/. / 


— 


14-42 


oaa 
oUU 




CrcB-like protein


n iq 
U. lo 


*"7 1 

/.I 


— 

-i 


70-88 


QA1 

oUl 


Ifi-6-16


Interferon-induced 6-16 family 


1 "la A(L 


1 <Q 1 


1 


1 *"7 AA 

17-99 


on 1 

oil I 




GLTT repeat (6 copies)


U. lo 


7 J 


J : 


26-54 


oUl 




CrcB-like protein 


U. 1 o 


/.I 


1 


OO 1 AA 

82-100 


QAO 

oU2 


ank 


Ankyrin repeat 


i 
1 


J./ 


1 


338-367 


oU2 


RmuC 


RmuC family


A AQ 


1 A 


-J 


621-657 


804


ig 


Immunoglobulin domain 




19.4




35-111 


804 


DUF708


Protein of unknown function (DUF708) 


0.27


5.6




230-246


804 


CDC50


LEM3 (ligand-effect modulator 3) fami


0.049


0.0 


1 


231-258


806


EGF


EGF-like domain 


0.0019


15.2


1 


60-95 


807


EGF


EGF-like domain


0.0019


15.2




60-95


808 


EGF 


EGF-like domain 


0.0019 


15.2 


1


60-95 


809 


PI3_PI4_kinase


Phosphatidylinositol 3- and 4-kinase


0.89 


3.6 


1 


6-35 


809 




Immunoglobulin domain 


4.9e-06


25.4 


1 


109-171 


811 


AlphaadaptinC 


Alpha adaptin AP2, C-terminal domain 


0.061 


5.2 


1 


92-104 


811


MHC_I


Class I Histocompatibility antigen, d 


0.021


9.1


2


120-205


812 


ig 


Immunoglobulin domain 


3.7e-10 


40.9


2 


78-137 


812


ig 


Immunoglobulin domain 


0.0018 


15.9 


3 


176-237


812 


ig 


Immunoglobulin domain 


3.7e-08


33.4 


4 


274-335


812 


DNA_pol_B_2


DNA polymerase type B, organellar
and


0.018


7.9


1


291-347


812


OapA


Opacity-associated protein A 


0.44


2.4


1 


OAA OOO 


812 


ig 


Immunoglobulin domain 


0.0012 


16.6 


5 


369-430 


812 


ig 


Immunoglobulin domain 


7.7e-07


28.5


6


465-529


813 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.65 


2.6 


1 


55-63 


813


LRRCT 


Leucine rich repeat C-terminal domain 


0.15 


5.9 


1 


61-85


813 


DUF909 


Bacterial protein of unknown function 


0.4 


5.7 


1 


237-256 


813 


ig 


Immunoglobulin domain 


0.0047 


14.3 


1 


295-358 


813 


ig 


Immunoglobulin domain 


1.2e-08 


35.2 


2 


393-452 


813 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


0.28 


4.1 


1 


629-671 


813 


ig 


Immunoglobulin domain 


1.2e-05 


24.0 


3 


1468- 
1530 


813 


ig 


Immunoglobulin domain 


1.1e-06


27.9 


4 


1565- 
1627 


813 


ig 


Immunoglobulin domain 


6.2e-09 


36.3 


5 


1662- 
1724 


813 


CD2 


T-cell surface antigen CD2 protein 


0.19 


3.9 


1 


1701- 
1749 


813 


ig 


Immunoglobulin domain 


2.6e-09 


37.7 


6 


1761- 
1823 


813 


ig 


Immunoglobulin domain 


8.7e-06 


24.5 


7 


1858- 
1926


813 


ig 


Immunoglobulin domain 


3.7e-10 


40.9 


8 


1961-
2020


813 


ig 


Immunoglobulin domain 


0.0018 


15.9 


9 


2059- 
2120 


813 


ig 


Immunoglobulin domain 


3.7e-08 


33.4 


10 


2157- 
2218 


813 


DNA_pol_B_2 


DNA polymerase type B, organellar 
and 


0.018 


7.9 


1 


2174- 
2230 


813 


OapA 


Opacity-associated protein A 


0.44 


2.4 


1 


2183- 
2205


813 


ig 


Immunoglobulin domain 


0.0012


16.6 


11 


2252- 
2313 


813 


ig 


Immunoglobulin domain 


7.7e-07 


28.5 


12 


2348- 


814 


DUF126 


Protein of unknown function DUF126 


0.08


6.1


1


1-9


814 


LRRNT 


Leucine rich repeat N-terminal domain 


3e-07 


26.4 


1 


28-56 


814 


LRR 


Leucine Rich Repeat 


0.0074 


12.4 


1 


58-81


814 


Phage_holin_4


Holin family


0.73


4.2 


1 


f£r\ oo 
69-00 


814 


LRR 


Leucine Rich Repeat 


0.00054 


16.2 


2 


82-105


814 


LRR 


Leucine Rich Repeat 


0.005 


12.9 


3 


106-129


814 


LRR 


Leucine Rich Repeat 


0.00025 


17.3 


4 


130-153 


814 


LRR 


Leucine Rich Repeat 


0.00088 


15.5 


5 


154-177 


814 


LRR 


Leucine Rich Repeat 


0.0028 


13.8 


6 


186-209 


814 


LRRCT 


Leucine rich repeat C-terminal domain


2.4e-13 


42.0 


1 


219-280 


814 


DUF909 


Bacterial protein of unknown function 


0.4 


5.7 


1 


432-451 


814 


ig 


Immunoglobulin domain 


0.0047 


14.3 


1 


490-553 


814 


ig 


Immunoglobulin domain 


1.2e-08 


35.2 


2 


588-647 


814 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


0.28 


4.1 


1 


824-866 


814 


ig 


Immunoglobulin domain 


1.2e-05 


24.0 


3 


1663- 
1725 


814 


ig 


Immunoglobulin domain 


1.1e-06


27.9 


4 


1760- 
1822 


814 


ig 


Immunoglobulin domain 


6.2e-09 


36.3


5 


1857- 
1919 


814 


CD2 


T-cell surface antigen CD2 protein 


0.19 


3.9 


1 


1896- 
1944 


814 


ig 


Immunoglobulin domain 


2.6e-09 


37.7 


6 


1956- 
2018 


814 


ig 


Immunoglobulin domain 


8.7e-06 


24.5 


7 


2053- 
2121 


814 


ig 


Immunoglobulin domain 


3.7e-10 


40.9 


8 


2156- 
2215 


814 


ig 


Immunoglobulin domain 


0.0018 


15.9 


9 


2254- 
2315 


814 


ig 


Immunoglobulin domain 


3.7e-08 


33.4 


10 


2352- 
2413 


814 


DNA_pol_B_2 


DNA polymerase type B, organellar 
and 


0.018 


7.9 




2369- 
2425 


814 


OapA 


Opacity-associated protein A 


0.44 


2.4 


1 


2378- 
2400


814 


ig 


Immunoglobulin domain 


0.0012 


16.6 


11 


2447-
2508


814 


ig 


Immunoglobulin domain 


7.7e-07 


28.5 


12 


2543- 

ZOO/ 


816 


Apolipoprotein 


Apolipoprotein A1/A4/E family 


2.3e-11


42.3 


-j 


93-168 


816 


DUF260 


Protein of unknown function DUF260


0.64


3.5




94-107


816 


Adeno_PIX


Adenovirus hexon-associated protein ( 


0.49 


4.4 




95-110 


816 


BcrAD_BadFG


BadF/BadG/BcrA/BcrD ATPase family 


0.12 


6.2 




134-180 


816 


Apolipoprotein 


Apolipoprotein A1/A4/E family


0.011 


10.5 




172-258 


816 


MM_CoA_mutase


Methylmalonyl-CoA mutase 


0.84 


1.9 




264-306 


817 


Apolipoprotein 


Apolipoprotein A1/A4/E family 


2.3e-11


42.3 




93-168 


817 


DUF260 


Protein of unknown function DUF260 


0.64 


3.5 




94-107 


817 


Adeno_PIX


Adenovirus hexon-associated protein ( 


0.49 


4.4 




95-110


817 


BcrAD_BadFG


BadF/BadG/BcrA/BcrD ATPase family 


0.12 


6.2 


1


134-180 


817 


Apolipoprotein 


Apolipoprotein A1/A4/E family


0.011


10.5 




172-258 


817 


MM_CoA_mutase


Methylmalonyl-CoA mutase


0.84


1.9


1


264-306


818 


DUF717 


Protein of unknown function (DUF717)


I 


A A 

4.0 


-J 


i no ni 

lOy-lzl 


818 


MHC_I


Class I Histocompatibility antigen, d


A /CO 


1 *7 

J.7 


1 


OK 710 

zzo-zjy 


819 


Pox_D5 


Poxvirus D5 protein-like 


1 


2.2 




1 £ oo 

16-28 


819 


phoslip 


Phospholipase A2


j.4e-4y 


172.4 


1 


O 1 1 A C 

21-143 


O 1 A 

819 


RFX_DNA_binding


RFX DNA-binding domain


A QA 


A 

Z.y 


1 


500/ 


821 


MR_MLE_N


Mandelate racemase / muconate lactoni


i .oe-u j 


17 A 
1 /.U 


— 

J 


O 1 17 

y-i iz 


821 


Peptidase_S26


Signal peptidase I


A 1Q 


1 0 


~ 


QA 


821 


CheR_N


CheR methyl transferase, all-alpha dom 


U.4 


7 
0. / 


-J 


<C 7A 

jo- /4 


on 1 

821 


MR_MLE


Mandelate racemase / muconate lactoni 


o c a AC 

z.je-Uo 


7Q A 

zy.y 


~j 


101 oci 

lyi-zjj 


o22 


NAP


Nucleosome assembly protein (NAP) 


/;„ i qi 

oe- 1 y i 


044. J 




1 Z-Zo J 


Q70 

oil 


GAT


GAT domain 


n *>7 
u.z / 


4.y 


— 


1 1 A 1 7 A 
1 1H-1Z0 


071 

oil 


r\r T13 1 1 c 


Protein of unknown function DUF115 


\). 10 


J.O 


— 

™| 


1 1 A 1 Al 
1 LO-l*tj 


971 

eZJ 


PP2C


Protein phosphatase 2C 


-2 Ap. no 
j.*fe- /z 


ZJU.U 


-j 


107-lfil 
1U /"jOj 


Q7A 

SZ4 


VWC 


von Willebrand factor type C domain 


z.ze- 1 u . 


17 fi 
J> /.o 


— 


101 1 S7 
ll/J 1 J / 


oZ4 


VWC 


von Willebrand factor type C domain 


A 7o AO 


11 1 
J J. 1 




1/CA OAS 
1 Ov-ZU J 


oZ4 


tit n 

ULa ! 


i lLa go main 


n OA 

U.Zh 


A 1 
O.J 




1 fil 7A0 
Ioj-ZUU 


Q7< 
oZj 


7tm_1


7 transmembrane receptor (rhodopsin f 


1 4*» or 
i.4e-zo 


flA 1 


~T 

~ 


t 171 
1-1 IJ 


OZO 


7tm_1


7 transmembrane receptor (rhodopsin f 


4.3e-4y 


1 A<C 7 
14D. / 




AH 787 


oil 


EGF


EGF-like domain 


A AA/C7 


1 1 7 
1 0.Z 





1^ A7 
jj-OZ 


Oil 


DSL


Delta serrate ligand 


A /l Q 


/l 7 
4. / 





A7 A7 

m-oi 


Q7C 

oZo 


Pox_A46


Poxvirus A46 family 


A ^ 


7 

Z.D 




1-1 J 


C7S 

oZo 


ExoD


Exopolysaccharide synthesis, ExoD 


A fiO 
U.oZ 


7 A 

lM 


-j 

-j 


O'f-O / 


070 
oZo 


RhoGAP


RhoGAP domain




1 87 A 


-j 


1A1 7«A 

IUI-ZjU 


07 Q 

oZo 


Sec6


Exocyst complex component Sec6 


A 07 

u.y / 


1 fi 
l.o 


-[ 


1 RA_7H7 


Q7Q 

ozy 




lud aomain 


i i » ii 
l.ieoj 


1 1 7 A 
1 1Z.0 


-~ 


^ 1H7 
J-1UZ 


ojy) 




lud aomain 


1 1-M 
1. IC'JJ 


1 17 A 
1 1Z.0 




J-IUZ 


fill 

OJ 1 


myosin_head


Myosin head (motor domain)


o.oe-/o 


7^7 7 
ZJ / .Z 


-j 


17-700 


Q1 1 


ATP_bind_2


P-loop ATPase protein family 


A 1 A 
U.lO 


A 0 


-t 

— 


i^o-i jy 


Q11 
OJl 




Phosphoribulokinase / Uridine kinase 




^ 7 

J.Z 


— 


17R-110 
ixo-i jy 


Q17 

OjZ 


myosin_head


Myosin head (motor domain)


*f.ie-yi/ 


10A 1 




J /"JO / 


Q17 
OJl 


ATP_bind_2


P-loop ATPase protein family 


U. 10 


A 0 


-j 




ojZ 


PRK


Phosphoribulokinase / Uridine kinase 


u, l*t 


^ 7 


-j 

— 


178-130 


814 
oj4 


IXXTi J 


7TM chemoreceptor 


n 17 

U. 1 / 


1 1 
1.1 





3 7- AO 
j t-Hy 


filA 


kazal 


Kazal-type serine protease inhibitor


Q 4f> Afi 


J J. J 




130-181 


814 


thyroglobulin_1


Thyroglobulin type-1 repeat


A 1#»-91 
H. LC-Z1 


RO 1 


1 


316-370 


81^ 


Micro A star 


Microvirus A* protein


u. 10 


S 1 
J.J 


-j 




oic 


Coronavirus 5 


Coronavirus gene 5 protein 


A 01 

u.y 1 


i n 

J.U 


-j 




01 < 

oij 


RPEL


RPEL repeat


A QI 


J.H 


-j 


<A(\ ^^A 
JHU-J jU 


OJO 


Micro A star 


Microvirus A* protein 


A 1A 
U. lO 


^ i 


-j 

i — 


41A-47A 
*f 1U-HZ0 


oi£ 


Coronavirus 5 


Coronavirus gene 5 protein 


A 01 

u.y i 


1 A 
J.U 




CA(\ «<1 
J*fU-JJj 


836 


RPFT 

sSJTE/L* 




0 81 

v/.O 1 


S 4 
j .*t 


-: 


540-550 


837 


BEX 


Brain expressed X-linked like family 


9.8e-86 


266.4 




14-125 


837 


ChaC 


ChaC-like protein 


0.2 


4.5 




67-92 


837 


IlvC


Acetohydroxy acid isomeroreductase, c 


0.14 


5.9 




68-97 


838 


LRRNT 


Leucine rich repeat N-terminal domain 


4.1e-05 


19.3 




31-59 


838 


LRR 


Leucine Rich Repeat 


0.045 


9.7 




61-84 


838 


LRR 


Leucine Rich Repeat 


0.0026 


13.9 


3 


109-132 


838 


LRR 


Leucine Rich Repeat 


0.002 


14.3 


4 


133-156 


838 


LRR 


Leucine Rich Repeat 


0.0034 


13.5 


5 


157-180


838 


LRR 


Leucine Rich Repeat


0.00019


17.8 


6 


181-204 


838 


LRR 


Leucine Rich Repeat


iC Aa a< 


19.3


7 


205-228 


838


LRR


Leucine Rich Repeat 


3.4e-05


20.2 


8 


229-252 


838


LRR


Leucine Rich Repeat 


0.59


6.0 


9 


253-276 


838 


LRR 


Leucine Rich Repeat 


9.3e-05 


18.8 


10 


277-300 


838


LRR


Leucine Rich Repeat 


0.0022 


14.1 


11 


301-324 


838


Scramblase


Scramblase 


0.76 


1.7 


1 


313-322 


838


LRR


Leucine Rich Repeat 


0.0001 


18.6 


12 


326-349 


838


LRRCT


Leucine rich repeat C-terminal domain


4.3e-13 


41.2 


1 


359-405 


OJO 


Ur rUl 15 


Domain of unknown function DUF20


1 


2.9 


1 


533-556 




dUTPase


dUTPase


0.34 


6.2 




343-362 


841 


ank 


Ankyrin repeat 


0.00082 


16.7 


1


1-27 


841


MM_CoA_mutase


Methylmalonyl-CoA mutase 


0.85 


1.9 


1 


9-43


841


ank 


Ankyrin repeat 


7.1e-07 


27.7 


2 


29-61 


841


ank 


Ankyrin repeat 


2.3e-09 


36.6 


3 


130-162 


841


ank 


Ankyrin repeat 


2.2e-10 


40.3 


4 


164-196 


841 


Myc_N_term 


Myc amino-terminal region 


0.27 


3.6 




514-541 


841


SAM


SAM domain (Sterile alpha motif) 


1.3e-06 


25.0 




588-640 


842 


DUF370 


Domain of unknown function 
(DUF370) 


1 


3.5 


1 


21-36 


842


ApoL


Apolipoprotein L


3.1e-195


658.7 


1 


43-345 


842


HupH_C


HupH hydrogenase expression protein,


0.99 


2.7 


1 


116-131 




DUF710


Family of unknown function (DUF710)


0.48 


5.0 


1 


297-337 


843


DUF370


Domain of unknown function
(DUF370)


1 


3.5 


1 


21-36 


843


ApoL 


Apolipoprotein L 


1.7e-194


656.3 


1 


43-345 


843


HupH_C


HupH hydrogenase expression protein,


0.99


2.7 




116-131 


843 


DUF710


Family of unknown function (DUF710)


0.48


5.0 




297-337 


844 


Uteroglobin


Uteroglobin family


1 


3.3 


1 


1-16 


844 


DUF84 


Protein of unknown function DUF84


0.098


5.9 


1 


8-22 


844 


DUF960


Staphylococcal protein of unknown fun


0.78


3.7 




38-63 


844 


Tail X 


Phage Tail Protein X


0.35


5.8




45-56


844 


LysM


LysM domain


0.36


6.9


1


48-56


844


ig


Immunoglobulin domain


3e-07


30.0


1 


53-110 


844 


ig


Immunoglobulin domain


1.8e-07


30.9 


2 


150-216 


844 


ig


Immunoglobulin domain


2.9e-08


33.8


3


255-310 


844 


ig


Immunoglobulin domain


4.6e-07


29.3 


4 


350-417 


845


Uteroglobin


Uteroglobin family


1


3.3 


1 


1-16 


845 


DUF84


Protein of unknown function DUF84


0.098


5.9


1 


8-22 


845


DUF960


Staphylococcal protein of unknown fun


0.78


3.7 


1 


38-63 


845


Tail X


Phage Tail Protein X


0.35 


5.8 


1 


45-56 


845 


LysM


LysM domain


0.36


6.9 




48-56 


845 


ig


Immunoglobulin domain


3e-07


30.0


1


53-110


845 


ig 


Immunoglobulin domain


1.8e-07


30.9 


2 


150-216 


845 


ig 


Immunoglobulin domain 


2.9e-08 


33.8 


3 


255-310 


845 


ig 


Immunoglobulin domain 


4.6e-07 


29.3 


4 


350-417 


845 


ig


Immunoglobulin domain


1.1e-07


31.6 


5 


456-516 


845 


ig 


Immunoglobulin domain 


8.8e-05 


20.8 


6 


553-617 


845 


APS_kinase


Adenylylsulphate kinase 


0.67 


2.8 


1 


593-609 


845 


fn3


Fibronectin type III domain 


0.75 


4.7 


1 


656-733 


845 


MAM 


MAM domain 


6.7e-77 


265.6 


1 


753-918


845 


E2F_TDP


Transcription factor E2F/dimerisation 


0.56 


3.7 


1 


761-787 


846 


zf-PARP 


Poly(ADP-ribose) polymerase and 
DNA-L 


0.61 


5.0 


1


38-54 


846 


Albicidin_res


Albicidin resistance domain 


0.49 


6.1 


1


290-297 


846 


SPDY 


Domain of unknown function 
(DUF317) 


0.37 


5.2 


1 


361-374 


846 


CBF 


CBF/Mak21 family 


0.00014 


14.4 




417-450 


847 


CNH 


CNH domain 


0.00087 


13.7 


1 


164-217 


847 


NHL 


NHL repeat 


0.14 


9.4 




204-229 


847 


Coprogen_oxidas 


Coproporphyrinogen III oxidase 


0.26 


1.9 


~ 


231-246 


847 


Clathrin 


Region in Clathrin and VPS 


0.0094 


11.5 




404-445 


847 


ENTH 


ENTH domain 


0.31 


5.7 


1 


794-807 


847 


C2 


C2 domain 


2.2e-18


63.6 


1 


797-876 


847 


PLA2_B 


Lysophospholipase catalytic domain 


9.1e-51 


178.0 


1 


1108- 
1317 


847 


DUF188 


Uncharacterized BCR, YaiI/YqxD
family 


0.9 


2.9 


1 


1314- 
1325 


847 


TAP42 


TAP42-like family 


1


2.0 


1 


1408- 
1413 


847 


PLA2_B 


Lysophospholipase catalytic domain 


1.2e-12 


43.6 




1429- 
1551 


848 


ENTH 


ENTH domain 


0.31 


5.7 


1 


43-56 


848 


C2 


C2 domain 


2.2e-18 


63.6 


1 


46-125 


848 


PLA2_B


Lysophospholipase catalytic domain 


2.4e-53 


187.1 


1 


357-566 


848 


DUF188 


Uncharacterized BCR, YaiI/YqxD
family 


0.9 


2.9 


1 


563-574 


848 


TAP42 


TAP42-like family


1 


2.0 




657-662 


848 


PLA2_B


Lysophospholipase catalytic domain 


1.2e-12 


43.6 




678-800 


849 


SNF7 


SNF7 


1.3e-54 


191.6 


-} 


18-178


849 


GatB_N


PET112 family, N terminal region


0.2 


4.6 




135-146 


849 


Interleukin 13 


Interleukin-13


0.24 


6.5 




156-167 


850 


p450 


Cytochrome P450


2.9e-05 


15.6 




25-112 


850 


Phage_attach 


Phage Head-Tail Attachment 


0.97 


1.6 


1


69-80 


851 


ig 


Immunoglobulin domain 


8e-09 


35.9 




48-105 


851 


ig 


Immunoglobulin domain 


1.5e-12 


49.8 




169-227 


851 


ig 


Immunoglobulin domain 


2.3e-06 


26.7 




265-344 


851 


CD36 


CD36 family 


0.38 


3.9 


1 


377-402 


851 


Neur_chan_memb


Neurotransmitter-gated ion-channel tr 


0.69 


2.3 


1 


392-401 


852 


ig 


Immunoglobulin domain 


8e-09 


35.9 


1 


44-101 


852 


ig 


Immunoglobulin domain 


1.5e-12 


49.8 




165-223 


852 


ig


Immunoglobulin domain 


2.3e-06 


26.7 




261-340 


852 


CD36 


CD36 family 


0.38 


3.9 


-\ 


373-398 


852 


Neur_chan_memb


Neurotransmitter-gated ion-channel tr 


0.69 


2.3 




388-397 


853 


ig 


Immunoglobulin domain 


8e-09 


35.9 




44-101 


853 


bZIP_Maf


bZIP Maf transcription factor 


0.4 


4.3 




101-127 


854 


C2 


C2 domain 


1.8e-39 


134.8 




158-245 


854 


C2 


C2 domain 


8.3e-37 


125.8 




289-377 


855 


DUF1058 


Protein of unknown function 
(DUF1058) 


0.49 


2.3 




79-92 


855 


Pep_M12B_propep


Reprolysin family propeptide 


7.2e-06 


18.8 




154-222 


855 


Reprolysin 


Reprolysin (M12B) family zinc metallo 


9.5e-18 


66.0 


2 


313-456


855 


Mu-conotoxin 


Mu-conotoxin


0.94 


4 6 


-j 


JJOO / / 


855 


Astacin 


Astacin (Peptidase family M12A)


0.65


^ 4 

J.H 


— 


1QQ 4A7 

joy-^uz 


855 


fn2 


Fibronectin type II domain 


0.59 


3.4 


1 


445-451 


855 


tsp_1


Thrombospondin type 1 domain


JC- 1 O 


jj.y 




546-596 


855 


ADAM_spacerl 


ADAM-TS Spacer 1 


1.6e-49 


174.7 


1 


702-813 


855 


DSL 


Delta serrate ligand


U.jo 


j.U 




794-812 


855 


tsp_1


Thrombospondin type 1 domain 


0.0007 


14.6 


2 


832-844 


855


zf-NF-X1


NF-X1 type zinc finger 


0.0071


8.8


2 


873-895 


855 


tsp_l 


Thrombospondin type 1 domain


0.0028 


12.7 


3 


888-909 


855


tsp_1


Thrombospondin type 1 domain 


8.2e-08 


27.8 


4 


945-995 


855 


Reo_sigmaC 


Reovirus sigma C capsid protein 


0.73 


2.0 


1 


1216- 
1224 


OJJ 


UrrwJ 1 


uncnaracterizea protein tamily (Ur rUO 


0.0073 


8.9 


1 


1284- 
1297 


855 


tsp_l 


Thrombospondin type 1 domain 


0.01 


10.8 


5 


1321- 
1364 


855


tsp_1


Thrombospondin type 1 domain


0.0037


12.3


7 


1429- 

1 ATX 
14/1 


855


tsp_1


Thrombospondin type 1 domain


1 Ao. A^ 

j.4e-uj 


1Q A 


Q 

o 


1 AHA 

14/4- 

1 ^1A 
IjjU 


856 


Ifi-6-16 


Interferon-induced 6-16 family


J. JC-U / 


9fi 9 


— 


Z l~'rl 


856 


CrcB


CrcB-like protein


0.18


7.1





z/-*o 


857 


GHMP_kinases


GHMP kinases putative ATP-binding
pro


\j. jj 


1 Q 


-j 


RI 190 


857 


abhydrolase


alpha/beta hydrolase fold




0 9 




101-z i*t 


857 


lipase 


Lipase 


0.64 


3.7 


1 


185-213 


857


abhydrolase


alpha/beta hydrolase fold


A AAC1 
U.uUoj 


1A ^ 
IU. J 


Jl 


n^A iia 


857


DLH


Dienelactone hydrolase family


A /I 


J.O 




7</r OQ1 

ZjO-ZoJ 


857


LIP


Secretory lipase


n nn 
u.uiz 


O.O 


-J 

-j 


O/TC OA A 

zoj-zyu 


857 




uiiuiiardLicriocu protein iamny ^Ux ruz 


A ^fi 
U.JO 


A 0 




zoo-zyo 


857 


abhydrolase_2


Phospholipase/Carboxylesterase


A A1 ^ 
U.Ul J 


1A 1 
1U. I 





7/^7 7QA 

zo /-zyu 


857 


Peptidase_M10_N


Matrix metalloprotease, N-terminal do


A £1 
U.Oj 


9 ^ 
Z.J 


-j 


90A 117 

zyo-j 1 / 


858 


GHMP_kinases


GHMP kinases putative ATP-binding
pro


U.J J 


1 0 


-j 


74-199 


858 


abhydrolase


alpha/beta hydrolase fold


0 01 


0 9 




1 ^4-9 ft7 


858 


lipase 


Lipase




j. / 


-j 


17R-9ftfi 
1 / o-^uo 


858 


abhydrolase


alpha/beta hydrolase fold


U.UWOJ 


1ft <J 

I U.J 


-5 


947-1 1 7 
z*t /-j 1 / 


858 


DLH 


Dienelactone hydrolase family


fl 4 

U.*T 


J.O 


-j 


940-976 


858 


LIP 


Secretory lipase


0 019 


O.O 


-j 


7<0 

Z.JO-Z6J 


858 


UPF0227 


T Incfiaractpri<ipH nrntpin Familv frTPl?ft9 


U.JO 


4 Q 


-j 


9^0-9RO 


858 


abhydrolase_2


Phospholipase/Carboxylesterase


ft OK 

V/.U 1 J 


lft 1 

IU. 1 


— 


96ft-9Rl 


858 


Peptidase_M10_N


Matrix metalloprotease, N-terminal do


ft 6^ 


9 ^ 
Z.J 


-j 


9RQ 11ft 
zoy-j iU 


859 


H-kinase_dim


Signal transducing histidine kinase,


ft 9S 


O.O 


— 


1 j-jj 


859 


Collagen 


Collagen triple helix repeat (20 copi 


4.8e-08 


32.5 


x 


244-284 


859 


Collagen 


Collagen triple helix repeat (20 copi 


3.3e-05 


21.8 




285-320 


859 


SRCR 


Scavenger receptor cysteine-rich doma 


6.6e-22 


78.9 




336-433 


859 


MBD 


Methyl-CpG binding domain 


0.52 


4.9 




365-389 


860 


CobS 


Cobalamin-5-phosphate synthase 


0.43 


3.4 




45-58 


860 


LGT 


Prolipoprotein diacylglyceryl transfe 


0.084 


6.6 




64-85 


860 


Collagen 


Collagen triple helix repeat (20 copi 


2.6e-07 


29.7 




304-344 


860 


Collagen 


Collagen triple helix repeat (20 copi 


3.3e-05 


21.8 


2 


345-380 


860 


SRCR 


Scavenger receptor cysteine-rich doma 


2.7e-34 


122.7 


1 


396-493


862


TFIIS 


Transcription factor S-II (TFIIS) 


0.73 


5.1 


1 


192-202 


862 


zf-C2H2 


Zinc finger, C2H2 type 


3.5e-05 


25.4 


1 


192-214 


862 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-06 


31.2 


2 


220-242 


862 


zf-BED 


BED zinc finger


0.33 


5.7 


1 


222-243 


862 


mRNA_cap_enzyme


mRNA capping enzyme, catalytic
domain 


0.56 


0.5 


1 


245-260 


862 


XPA_N


XPA protein N-terminal 


0.78 


5.1 


2 


245-257 


862 


zf-C2H2 


Zinc finger, C2H2 type 


2.9e-07


33.8 


3 


248-270 


862 


TFIIS 


Transcription factor S-II (TFIIS) 


0.89 


4.8 


3 


276-286 


862 


zf-C2H2


Zinc finger, C2H2 type 


2e-06 


30.4 


4 


276-298 


862 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-05 


26.8 


5 


304-326 


862 


mRNA_cap_enzyme


mRNA capping enzyme, catalytic 
domain 


0.56 


0.5 


2 


329-344 


862 


XPA_N


XPA protein N-terminal 


0.78 


5.1 


4 


329-341 


862 


zf-C2H2 


Zinc finger, C2H2 type 


5.4e-07 


32.7 


6 


332-354 


862 


TFIIS 


Transcription factor S-II (TFIIS) 


0.29 


6.5 


5 


360-370 


862 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-06


31.5 


7 


360-382 


862 


XPA_N


XPA protein N-terminal 


0.13 


7.8 


6 


385-397 


862 


TFIIS 


Transcription factor S-II (TFIIS) 


0.57 


5.5 


6 


388-398 


862 


zf-C2H2 


Zinc finger, C2H2 type 


9.2e-07 


31.8 


8 


388-410 


862 


XPA_N


XPA protein N-terminal 


0.97 


4.8 


7 


413-425 


862 


TFIIS 


Transcription factor S-II (TFIIS) 


0.14 


7.6 


7 


416-426 


862 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-06 


29.1 


9 


416-438 


862 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.38 


3.3 


1 


428-449 


862 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-06


31.5 


10 


444-466 


862 


TFIIS 


Transcription factor S-II (TFIIS) 


0.054 


9.0 




472-482 


862 


zf-C2H2 


Zinc finger, C2H2 type 


2.9e-07 


33.8 


11 


472-494 


862 


zf-BED 


BED zinc finger 


0.64 


4.8 




477-495 


862 


DC1 


1 If\ ***** A On 4 A A A 

1/2 472 487.. 19 44 


0.16 


6.2 




500-515 


862 


zf-C2H2 


Zinc finger, C2H2 type 


0.00082 


19.9 


12 


500-523 


863 


Dor1


Dor1-like family


7e-203 


684.1 


1 


197-553 


863 


bZIP 


bZIP transcription factor 


0.3 


6.3 


1 


224-246 


864 


U1-C


U1 small nuclear ribonucleoprotein C


0.00024 


16.9 


1 


2-51 


864 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H type (an 


2.2e-09 


33.8 


1 


52-78 


865 


WD40 


WD domain, G-beta repeat 


4.2e-08 


31.1 


1 


202-238 


865 


WD40 


WD domain, G-beta repeat 


0.54


6.3 




282-307 


866 


Fels1


Fels-1 Prophage Protein-like


0.61 


5.8 


1 


361-376 


867 


aminotran_3


Aminotransferase class-III


1.5e-40 


134.4 


1 


95-214 


867 


OATP_N


Organic Anion Transporter Polypeptide 


0.81 


4.0 




240-258 


867 


aminotran_3


Aminotransferase class-III


8.9e-66 


218.5 




281-509 


868 


aminotran_3


Aminotransferase class-III


1.2e-09 


31.3 


1


52-111 


868 


OATP_N


Organic Anion Transporter Polypeptide 


0.81 


4.0 


1 


137-155 


868 


aminotran_3


Aminotransferase class-III


8.9e-66 


218.5 




178-406 


869 


trypsin 


Trypsin 


4.5e-71 


220.5 


1 


63-289 


870 


Glycos_transf_1


Glycosyl transferases group 1 


1.7e-17 


64.4 


1 


144-239 


O II 


MHYT


Bacterial signalling protein N termin 


A C 
0.0 


4.Z 




291-328 


873 


EGF 


EGF-like domain 


2.9e-07 


28.9 




7-43 


873 


laminin_EGF


Laminin EGF-like (Domains III and V) 


1 


4.3 




21-43 


873 


EGF 


EGF-like domain 


9.2e-10


38.0 


2 


50-81 


873 


EGF 


EGF-like domain 


1.2e-07 


30.3 


3 


88-119 


873 


EGF 


EGF-like domain 


2.7e-11


43.5 


4 


126-157 


873 


EGF 


EGF-like domain 


5e-11


42.5 


5 


168-199 


873 


DSL 


Delta serrate ligand 


0.32 


5.2 


3 


190-199 


873 


EGF 


EGF-like domain 


0.0091 


12.7 


6 


209-234


873


EGF 


EGF-like domain 


0.022 


11.3 


7 


243-267 


873 


EGF 


EGF-like domain 


5e-09 


35.3 


8 


280-311 


873 


EGF 


EGF-like domain 


1.3e-07 


30.2 


9 


319-350 


873 


Cripto 


Cripto growth factor 


0.11 


6.4 


2 


324-351 


873 


EGF 


EGF-like domain 


8.2e-11


41.8 


10 


358-389 


873 


Cripto 


Cripto growth factor 


0.00049 


14.6 


3 


363-390 


873 


laminin_EGF


Laminin EGF-like (Domains III and V) 


0.042 


9.1 


5 


378-390 


873 


EGF 


EGF-like domain 


4.6e-08 


31.8 


11 


396-427 


873 


laminin_EGF


Laminin EGF-like (Domains III and V) 


0.25 


6.4 


6 


416-427 


873 


sushi 


Sushi domain (SCR repeat) 


1.5e-06 


28.7 


1 


433-486 


873 


EGF 


EGF-like domain 


8.7e-09 


34.5 


12 


492-523 


873 


EGF 


EGF-like domain 


3.9e-09 


35.7 


13 


530-561 


873 


EGF 


EGF-like domain 


1.2e-07 


30.4 


14 


568-599 


873 


granulin 


Granulin 


1 


3.6 


2 


596-608 


873 


EGF 


EGF-like domain 


2.9e-07 


29.0 


15 


606-637 


873 


DSL 


Delta serrate ligand 


0.69 


4.1 


9 


627-637 


873 


fn3


Fibronectin type III domain 


1.3e-10 


38.6 


1 


641-722 


873 


fn3 


Fibronectin type III domain 


8e-12 


42.8 


2 


740-823 


873 


fn3


Fibronectin type III domain 


1.2e-12 


45.7 


3 


839-921 


873 


EGF 


EGF-like domain 


5.8e-10 


38.7 


16 


1046- 
1077 


873 


Cripto 


Cripto growth factor 


0.047 


7.7 


5 


1051- 
1078 


875 


AdoHcyase 


S-adenosyl-L-homocysteine hydrolase 


2.2e-68 


222.4 


1 


81-217 


875 


AdoHcyase 


S-adenosyl-L-homocysteine hydrolase 


1.8e-55 


180.1 


2 


218-507 


875 


AdoHcyase_NAD


S-adenosyl-L-homocysteine hydrolase,


2.2e-106


363.6 


1 


267-428 


875 


TrkA-N 


TrkA-N domain 


0.023 


10.7 


1 


291-322 


875 


GlutR_NAD_bind


Glutamyl-tRNAGlu reductase, NAD(P) 
bi 


0.086 


8.1 


2 


337-353 


876 


UQ_con 


Ubiquitin-conjugating enzyme 


0.0058 


11.9 


1 


47-77 


877 


Prominin 


Prominin 


0 


1616.6


1 


18-823 


877 


SPDY 


Domain of unknown function 
(DUF317) 


0.15 


6.5 


1 


80-93 


877 


DUF705


Protein of unknown function (DUF705) 


0.98 


1.9 


1 


555-565 


878 


fibrinogen_C 


Fibrinogen beta and gamma chains, C-t 


7.6e-56 


190.6 


1 


146-382 


879 


fibrinogen_C


Fibrinogen beta and gamma chains, C-t 


7.6e-56 


190.6 


1 


146-382 


880 


fibrinogen_C 


Fibrinogen beta and gamma chains, C-t 


7.6e-56 


190.6 


1 


146-382 


881 


DUF846


Eukaryotic protein of unknown functio 


0.094 


4.8 


1 


83-113 


882 


DUF381 


Domain of unknown function 
(DUF381) 


0.48 


4.4 


1 


29-35


883 


Trp_Tyr_perm 


Tryptophan/tyrosine permease family 


0.0026 


10.3 


1 


42-63 


883 


aa_permeases 


Amino acid permease 


8.4e-32 


115.8 


1 


48-371 


883 


Pox 15 


Poxvirus protein 15 


0.24 


6.0 


1 


162-179 


883


serine_carbpept 


Serine carboxypeptidase 


0.41


2.3 


1 


378-398 


884


pkinase 


Protein kinase domain 


6.3e-09 


32.0 


1


100-150 


884 


CtsR 


Firmicute transcriptional repressor o 


0.61 


3.9 


1 


146-157 


884 


pkinase 


Protein kinase domain 


1.3e-07 


27.2 


2 


151-181 


884 


Pox ser-thr kin 


Poxvirus serine/threonine protein kin 


0.31 


3.8 


1 


165-176 


884 


Herpes_UL3 


Herpesvirus UL3 protein 


0.72 


4.0 


1 


338-383 


884 


pkinase 


Protein kinase domain 


0.00084 


13.7 


3 


444-495 


884 


pkinase 


Protein kinase domain 


2.1e-05 


19.4 


4 


604-659 


885 


lectin c 


Lectin C-type domain 


9.9e-10 


40.5 


1 


47-107 
886


spectrin 


Spectrin repeat 


0.4 


5.5 


1 


1042-1095


887 


spectrin 


Spectrin repeat 


0.4 


5.5 


1 


1042-1095


888 


Peptidase_M20 


Peptidase family M20/M25/M40 


5.2e-24 


86.6 


1 


55-295


889 


sugar_tr


Sugar (and other) transporter 


0.11 


5.5 


1 


47-103 


889 


Octopine_DH


NAD/NADP octopine/nopaline dehydrogen


0.26 


4.6 


1 


153-169 


889 


sugar_tr


Sugar (and other) transporter 


3.7e-08 


28.6 




201-335 


890 


T4 deiodinase 


Iodothyronine deiodinase


0.37 


4.0 


1 


168-179 


891 


ig 


Immunoglobulin domain 


8.5e-07 


28.3 


1 


55-127 


891 


denso VP4 


Capsid protein VP4 


0.38 


2.7 


1 


57-69 


892 


bromodomain 


Bromodomain 


9.5e-45 


158.8 




63-152 


892 


bromodomain 


Bromodomain 


3e-40


143.5 


\ 


356-445 


892 


Alpha adaptin C 


Alpha adaptin AP2, C-terminal domain 


0.48 


2.6 


1 


395-407 


892 


Phage X 


Phage X family 


0.97 


3.7 




438-469 


892 


eIF3c N 


Eukaryotic translation initiation fac 


0.51 


1.2 


\ 


473-559 


892 


Vitellogenin_N


Lipoprotein amino terminal region 


0.61 


1.5 


1 


484-539 


892 


Herpes_U44 


Herpes virus U44 protein 


0.47 


3.1 


1 


515-529 


892 


MAGP 


Microfibril-associated glycoprotein ( 


0.82 


2.7 


1 


919-958 


893 


Pox A type inc 


Viral A-type inclusion protein repeat 


0.23 


7.6 


1 


197-216 


893 


OLF 


Olfactomedin-like domain 


4.6e-121


412.4 


1 


220-470 


893 


Phage_X


Phage X family 


0.57 


4.5 


1 


362-389 


893 


Peptidase_M10_N


Matrix metalloprotease, N-terminal do 


0.86 


2.1 


1 


373-383 


893 


FeThRed_B 


Ferredoxin thioredoxin reductase cata 


0.96 


2.3 




377-393 


894 


kazal 


Kazal-type serine protease inhibitor 


1.7e-10 


44.0 


! 


88-132 


894 


efhand 


EF hand 


2.2e-05 


23.3 


1 


178-206 


894 


ig 


Immunoglobulin domain 


6.4e-06 


25.0 


1 


262-322 


894 


ig 


Immunoglobulin domain 


2e-09 


38.2 




354-414 


894 


SsgA 


Streptomyces sporulation and cell div 


0.35 


5.9 


1 


541-549 


895 


aminotran_1_2


Aminotransferase class I and II 


7.5e-20 


71.8 


1 


81-257 


895 


DegT_DnrJ_EryC1


DegT/DnrJ/EryC1/StrS aminotransferase


1 


2.4 


1 


158-178 


895 


TPP_enzymes_C 


Thiamine pyrophosphate enzyme, C-term


0.35 


3.3 




258-279 


896 


LIM 


LIM domain 


9.9e-09 


32.9 


i 


24-80 


896 


LIM 


LIM domain 


2e-13 


49.7 




83-134 


896 


LIM 


LIM domain 


5.3e-19 


69.5 




153-209 


896 


DUF866 


Eukaryotic protein of unknown functio 


0.035 


7.5 




178-199 


896 


LIM


LIM domain 


7.5e-07 


26.3 




212-253 


896 


VHP 


Villin headpiece domain 


4.6e-25 


77.5 




538-573 


897 


LytTR 


LytTr DNA-binding domain 


0.051 


9.5 


\ 


14-49 


897 


COX4 


Cytochrome c oxidase subunit IV 


0.61 


4.7 


1 


188-207 


897 


pkinase 


Protein kinase domain 


2.9e-102


349.9 




356-613 


897 


TMP 


TMP repeat 


0.37 


8.0 




579-589 


898 


DCX 


Doublecortin 


1.4e-12 


44.7 




130-194 


898 


LytTR 


LytTr DNA-binding domain 


0.051 


9.5 




201-236 


898 


COX4 


Cytochrome c oxidase subunit IV 


0.61 


4.7 




375-394 


898 


pkinase 


Protein kinase domain 


2.9e-102


349.9 




543-800 


898 


TMP 


TMP repeat 


0.37 


8.0 


1 


766-776 
899


glutaredoxin 


Glutaredoxin 


0 0OS9 


17 1 
12. 1 


i 


101-154 


899 


GST N 


Glutathione S-transferase, N-terminal


0 0^1 


Q A 

y.4 


1 


102-152 


899 


ArsC 


ArsC family 




4 8 
4.0 


1 


105-131 


899 


GST C 


Glutathione S-transferase, C-terminal


o ooni i 


1 /.o 


1 


278-370 


899 


UL21 


Herpesvirus UL21 


o qs 


A 1 

U.3 


1 


301-329 


900 


Collagen 


Collagen triple helix repeat (20 copi


9 d/a AC 


OO 1 

2Z.3 


1 


27-60 


900 


Collagen 


Collagen triple helix repeat (20 copi 


l s<=» cn 

l . JC-U/ 


1A /C 

3U.O 


2 


61-106 


900 


C1q


C1q domain


9 Qp-79 


7<A O 


i 


116-241 


900 


TOBE 


TOBE domain 


V. J 


0.3 


1 


207-226 


901 


Herpes BMRF2 


Herpesvirus BMRF2 protein 


0 049 


7 7 
/.Z 


1 


8-26 


902 


BRCT 


BRCA1 C Terminus (BRCT) domain 




11/1 
J 1.4 


i 
1 


10-93 


902 


BRCT 


BRCA1 C Terminus (BRCT) domain




0/.3 


2 


96-183 


902 


Sec6 


Exocyst complex component Sec6


0 71 


7 1 
2.3 


1 


367-395 


902 


BRCT 


BRCA1 C Terminus (BRCT) domain


/.oe-io 


01. J 


3 


479-570 


902 


BRCT 


BRCA1 C Terminus (BRCT) domain


< 7 P in 

j . / e- 1 y 


Oj. I 


4 


579-652 


902 


BRCT 


BRCA1 C Terminus (BRCT) domain


7 Ip. 1 » 
z. oe- 1 o 


f^X A 


c 

5 


***** 

737-823 


902 


RinB 


Transcriptional activator RinB 


0 11 

v.JJ 


J. 4 


i 
1 


796-847 


902 


BRCT 


BRCA1 C Terminus (BRCT) domain


0 098 
v.UZO 


O A 


a 
0 


846-881 


902 


Phage Coat A 


Phage Coat Protein A 


0.82 


3.9 


1 


924-936 


903


BRCT 


BRCA1 C Terminus (BRCT) domain


< Oo AO 

j.ye-uy 


31.4 


1


10-93 


904 


Phage_X 


Phage X family 


0.71 


4.2 


1


16-41 


904 


2OG-FeII_Oxy


2OG-Fe(II) oxygenase superfamily


A 07 
U.2/ 


6.0 


1


195-273 


905 


LRR 


Leucine Rich Repeat


n aaai 


lo.o 


1 


4-27 


905 


LRRCT 


Leucine rich repeat C-terminal domain


4.3e-13


41.2


1 


37-83 


905 


UPF0118 


Domain of unknown function (DUF20)


i 
l 


O A 


1 


211-234 


906 


ig 


Immunoglobulin domain 


7 Qq A/C 

/.ye-uo 


24.7 


1 


25-79 


906 


COX17 


Cytochrome C oxidase copper chaperone


A /?8 
U.Oo 


J.O 


1 


182-195 


907 



TB2_DP1_HVA22


TB2/DP1, HVA22 family 


1 Rp» 14 


123.0 


1 


3-96 


907 


ELM2 


ELM2 domain 


0 si 


^ 7 


1 


99-124


908 


LRRNT 


Leucine rich repeat N-terminal domain 


0.00068


15.2


1


23-49 


908 


LRR 


Leucine Rich Repeat 


O. /C"v/J 


18.9


1 


51-74


908 


Sal vir VRP3 


Salmonella virulence-associated 28kDa 


J 


1 R 


1 
X 


64-88


908 


LRR 


Leucine Rich Repeat 


0 00019 


18 4 


2 


75-98


908 


LRR 


Leucine Rich Repeat 


0 0014 




3 


99-122


908 


LRR 


Leucine Rich Repeat 


9.9e-06 


22.1 


4 


123-146 


908 


LRRCT 


Leucine rich repeat C-terminal domain


2.3e-15


48.2


1 


156-208 


908 


ig 


Immunoglobulin domain 


1.3e-08 


35.1 


1 


224-283 


908 


ig 


Immunoglobulin domain


3.8e-09


37.1 


2 


320-376 


908 


ig 


Immunoglobulin domain 


0.00083 


17.1 


3 


416-472 


908 


BON 


Transport-associated domain


0.14


7.1 


1 


477-489 


908 


ig 


Immunoglobulin domain 


2.8e-08 


33.9 


4 


533-590 


908 


pec_lyase_N


Pectate lyase, N terminus


0.19 


3.9 


1 


670-676 


908 


An_peroxidase 


Animal haem peroxidase 


1.1e-193


653.6 


1


770-1309


908 


PAL 


Phenylalanine and histidine ammonia-l


0.53 


2.6 


1 


1037-1054


908 


7tm_1


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 


1 


1101-1109


908 


Peptidase_C1


Papain family cysteine protease 


0.76 


2.1 


1


1194-1211


908 


PetG 


Cytochrome B6-F complex subunit 5 


0.51 


5.7 


1 


1245-1278
908


DUF978


Bacterial protein of unknown function


0.67 


3.8 


1 


1257-1270


908 


TILa 


TILa domain


0.00018 


16.9 


1


1438-1477


908 


PSP94 


Beta-microseminoprotein (PSP-94)


0.11


8.0 


1 


1439-1470


908 


vwc 


von Willebrand factor type C domain


2e-10 


38.0 


1 


1439-1494


909 


LRRNT 


Leucine rich repeat N-terminal domain


0.00068 


15.2 


1 


54-80 


909 


LRR


Leucine Rich Repeat


O AC 

o.7e-05 


18.9 


1 


82-105 


909 


Sal vir VRP3 


Salmonella virulence-associated 28kDa


1 


3.8 


1 


95-119 


909 


LRR 


Leucine Rich Repeat


0.00012


18.4 




106-129 


909 


LRR 


Leucine Rich Repeat


0.0034


13.5 




130-153 


909 


LRR 


Leucine Rich Repeat 


9.9e-06 


22.1 




154-177 


909 


LRRCT 


Leucine rich repeat C-terminal domain


2.3e-15 


48.2 




187-239 


909 


ig 


Immunoglobulin domain 


1.3e-08 


35.1 


i 


255-314 


909 


ig


Immunoglobulin domain


3.8e-09 


37.1 




351-407 


909 


ig 


Immunoglobulin domain


0.00083 


17.1 




447-503 


909 


BON


Transport-associated domain 


0.14 


7.1 




508-520 


909 


ig


Immunoglobulin domain 


2.8e-08 


33.9 


i 


564-621 




pec_lyase_N


Pectate lyase, N terminus 


0.19 


3.9 




701-707 


909 


An_peroxidase


Animal haem peroxidase 


1.1e-193


653.6 


-j 


801-1340


909 


PAL 


Phenylalanine and histidine ammonia-l


0.53 


2.6 


i 


1068-1085


909 


7tm_1


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 


n — 


1132-1140


909 


Peptidase_C1


Papain family cysteine protease 


0.76 


2.1 


i 


1225-1242


909 


PetG 


Cytochrome B6-F complex subunit 5 


0.51 


5.7 


i 


1276-1309


909 


DUF978 


Bacterial protein of unknown function


0.67 


3.8 


1 


1288-1301


909 


TILa 


TILa domain


0.00018 


16.9 




1469-1508


909 


PSP94


Beta-microseminoprotein (PSP-94)


0.11 


8.0 


1 


1470-1501


909 


vwc 


von Willebrand factor type C domain


2e-10 


38.0 


1 


1470-1525


910 


LRRNT 


Leucine rich repeat N-terminal domain


0.00068


15.2 


1 


23-49 


910 


LRR 


Leucine Rich Repeat


o. /e-lD 


18.9




51-74 


910 


LRR 


Leucine Rich Repeat 


0.00032 


17.0 




75-98 


910 


LRR 


Leucine Rich Repeat


0.025


10.6 




99-122 


910 


LRR 


Leucine Rich Repeat


0.00069


15.8 




123-146 


910 


ig 


Immunoglobulin domain 


1.3e-08 


35.1 




201-260 


910


ig 


Immunoglobulin domain 


3.8e-09 


37.1






910 


ig 


Immunoglobulin domain 


0.00083 


17.1 




393-449 


910 


BON 


Transport-associated domain 


0.14 


7.1 




454-466 


910 


ig 


Immunoglobulin domain 


0.47 


6.8 




514-532 


910 


An_peroxidase 


Animal haem peroxidase 


1.1e-193


653.6 




663-1202


910 


PAL 


Phenylalanine and histidine ammonia-l 


0.53 


2.6 




930-947 


910


7tm_1


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 




994-1002
910


Peptidase_C1


Papain family cysteine protease 


0.76 


2.1 





1087-1104


910 


PetG 


Cytochrome B6-F complex subunit 5 


0.51 


5.7 


■i — - 


1138-1171


910


DUF978


Bacterial protein of unknown function 


0.67 


3.8 


I 


1150-1163


910


TILa


TILa domain 


0.00018 


16.9 


1 


1331-1370


910




" ; ; ; 

Beta-microseminoprotein (PSP-94) 


0.11 


8.0 


1 


1332-1363


910




von Willebrand factor type C domain


2e-10 


38.0 




1332-1387


911 


EGF 


EGF-like domain


0.059 


9.8 


2 


47-59 


911 


EGF 


EGF-like domain 


0.0036 


14.2 


3 


85-99 


911 


EGF


EGF-like domain


4.9e-08 


31.7 


4 


106-134 


911 


EGF


EGF-like domain 


4.2e-10 


39.2 


5 


172-203 


911 


EGF 


EGF-like domain


0.00083 


16.5 


6 


210-245 


911


laminin_EGF


Laminin EGF-like (Domains III and V) 


0.014 


10.8 


3 


216-247 


911


laminin_G


Laminin G domain 


0.0021 


12.5 


1 


275-335 


911


laminin_G


Laminin G domain 


0.018 


9.3 


2 


386-401 


911




Protein of unknown function, DUF604 


0.84


2.9 


1 


390-412 


911


laminin_G


Laminin G domain 


0.22 


5.5 


3 


483-541 


911


EGF


EGF-like domain


9.9e-11


41.5 


7 


574-605 


911


EGF


EGF-like domain


0.43 


6.7 


8 


611-632 


911




Protein of unknown function 

fTW IFin£T\ 


0.79 


3.0 


1 


614-628 


911 


laminin_G


Laminin G domain 


1.9e-05 


19.6 


4 


663-728 


911 


Melibiase


Melibiase 


0.9 


2.3 


1 


740-755 


911 


laminin_G


Laminin G domain


0.075 


7.2 


5 


773-788 


911 


EGF


EGF-like domain


2.2e-09 


36.6 


9 


823-854 


911 




Delta serrate ligand 


0.44 


4.8 


2 


844-854 


911 


EGF 


EGF-like domain


6.4e-06 


24.1 


10 


861-892 


911 


EGF 


EGF-like domain


0.71


5.9 


11 


901-933 


911 


DSL 


Delta serrate ligand 


0.67 


4.2 


4 


923-933 


911 


EGF 


EGF-like domain


3e-06 


25.3 


12 


940-971 


913 


Omega-atracotox


Omega-atracotoxin


0.43 


3.7 


1 


24-44 


913 


M 


M protein repeat 


0.28 


8.8 


1 


146-166 


913 




Uncharacterised protein family (UPF01 


0.04 


7.4 


1 


322-347 


914


RIIa


Regulatory subunit of type II PKA R-s 


1e-14


54.8 


1 


25-62 


914 


SURF6 


Surfeit locus protein 6 


0.027 


7.2 


1 


42-113 


914 


cNMP_binding


Cyclic nucleotide-binding domain 


7.2e-31 


112.5 


1 


152-240 


914 


RNA_pol_Rpb2_4


RNA polymerase Rpb2, domain 4 


0.28 


6.2 


1 


184-191 


914


cNMP_binding


Cyclic nucleotide-binding domain


9.4e-32 


115.7 


2 


270-364 


914 


Methyltransf_1


6-O-methylguanine DNA methyltransfera


0.64 


4.3 


1 


325-337 


915 


DIL 


DIL domain 


1.8e-40 


144.6 


1 


214-323 


915 


PDZ


PDZ domain (Also known as DHR or GLGF


1.7e-14 


52.8 


1 


555-639 


916 


PLAT 


PLAT/LH2 domain 


9.8e-32 


109.3 


1 


2-111 


916 


lipoxygenase 


Lipoxygenase 


3.9e-194


655.1 


1 


121-647 


916 


DUF181 


Uncharacterized ACR, COG1944


0.81 


2.4 


1 


247-258 


916 


PG_binding_1


Putative peptidoglycan binding domain 


0.5 


5.6 


1 


420-436 


916 


Dus 


Dihydrouridine synthase (Dus)


n 1 8 

U. 15 


4.0 


1 


604-647 


917 


PLAT 


PLAT/LH2 domain 


9.8e-32


109.3


1


2-111 


917 


lipoxygenase 


Lipoxygenase


1 0*» A% 

i.ye-*fo 


104.1 




112-293 


917 


DUF181 


Uncharacterized ACR, COG1944


0.81 


2.4 


1


220-231 


918 


PLAT 


PLAT/LH2 domain 


9.8e-32


109.3




2-111 


918 


lipoxygenase 


Lipoxygenase 


2.7e-57 


194.3 


1


121-322 


918 


DUF181 


Uncharacterized ACR, COG1944


0.81 


2.4 


1 


249-260 


920 


TFIIS 


Transcription factor S-II (TFIIS)


1 


4.6 


1 


5-15 


920 


DUF536 


Protein of unknown function, DUF536


0.19 


7.9 




214-251 


920 


FCH 


Fes/CIP4 homology domain 


0.5 


5.6 


1


259-278 


926 


DSI 


Deoxyhypusine synthase 


0.53 


2.5 


1 


21-36 


926 


SH3BP5


SH3 domain-binding protein 5 


0.097 


6.5 


1


82-102 


926 




Transmembrane amino acid transporter 


3.5e-139


472.5 


1 


114-517 


926 


Herpes_U47


Herpes virus glycoprotein U47


0.69 


1.1 


1 


141-158 


926 


Omega-atracotox




0.35 


4.0 


1 


168-184 


926 


DUF588 


Domain of unknown function (DUF588)


0.58


5.1 


1 


425-444 


926 


GSPII_F


Bacterial type II secretion system pr


A /1< 
U.40 


3.6 


1 


438-455 


926 


FtsX 


Predicted permease


A 1<T 


5.4 


1 


454-523 


927 


EGF 


EGF-like domain


O AO/f 


11.2 


1 


42-57 


927 


EGF 


EGF-like domain


i.je-uo 


Zo.o 


2 


60-88 


927 


EGF 


EGF-like domain


1 Oo AO 

l .ze-uy 


i/.5 


3 


95-128 


927 


Cripto 


Cripto growth factor


A 

U.oO 


3.4


1 


101-132 


927 


laminin_EGF


Laminin EGF-like (Domains III and V)


a ao^ 


O A 


Z 


106-130 


927 


EGF 


EGF-like domain 


c <:« a*7 


on a 
Z/.y 


4 


135-171 


927 


EGF 


EGF-like domain 


1*» 1A 


41.4 


5 


178-209 


927 


EB 


EB module


A OA 

u.zo 


5.4 


1 


183-209


927 


EGF 


EGF-like domain


oe-Uo 


31.7 


6 


216-247 


927 


DUF990 


Protein of unknown function (DUF990)


U.Z^ 


5.3 


1 


302-336 


927 


MARVEL 


Membrane-associating domain


A 1 ^ 


c o 

J.O 




305-333 


927 


PAP2 


PAP2 superfamily


A QQ 
U.OO 


J. / 


1 


311-334 


927 


Colicin_V


Colicin V production protein


A OQ 


3.5 


1 


315-336 


928 


Ornatin 


Ornatin


A 


A 1 

4./ 


1 


125-132 


928 


PP1_inhibitor


PKC-activated protein phosphatase-1 i


A HQ 

U. /o 


Z.Z 


1 


423-439 


929 


ank 


Ankyrin repeat


A A1 1 


lz.7 




142-167


930 


LRRNT 


Leucine rich repeat N-terminal domain


A TO 


A 

o.U 


— 

1 


66-86


930 


DUF6 


Integral membrane protein DUF6




ion 




86-129 


930 


DUF6 


Integral membrane protein DUF6 


7e-05 


20.9 




180-277 


931 


endotoxin 


delta endotoxin


A oc 
U.65 


2.3 




203-220 


932 


Lipoprotein 8 


Hypothetical lipoprotein (MG045 famil 


0.7 


1.1 


i 


65-79 


933 


Peptidase_M24


metallopeptidase family M24


2.2e-70 


244.0 


1 


88-326 


933 


DUF120 


Domain of unknown function DUF120


0.089 


7.1 


1 


169-180 


934 


Neurexophilin


Neurexophilin


2e-258 


804.9 




3-308 


934 


NnrS 


NnrS protein 


0.47 


3.0 


1 




938 


L27 


L27 domain 


7.3e-19 


69.4 




13-68 


938 


Not3 


Not1 N-terminal domain, CCR4-Not comp


0.95 


2.9 




54-77 


938 


PDZ 


PDZ domain (Also known as DHR or GLGF


8.1e-22 


78.5 




93-172 


938 


CDC50


LEM3 (ligand-effect modulator 3) fami 


1 


2.1 




159-174 


938 


DUF100 


Protein of unknown function DUF100 


0.2 


4.1 




175-188 


939


DIE2_ALG10


DIE2/ALG10 family 


7.6e-72 


248.9 




28-146 


939 


DUF718 


Protein of unknown function (DUF718)


VJ.04 


4.4


1 


36-43 


939 


Gemini_mov 


Geminivirus putative movement protein


ft 49 


A £L 
4.0 


1


101-115 


940 


rrm 


RNA recognition motif. (a.k.a. RRM, R


i no 




1 


61-128 


940 


RbsD_FucU


RbsD / FucU transport protein family


ft si 




1 


123-147 


940 


HemX 


HemX 


ft 17 




1


142-173


940 


rrm 


RNA recognition motif. (a.k.a. RRM, R


4£ A 11 

H-.oe-ij 


4o.O 


2


186-253 


940 


rrm 


RNA recognition motif. (a.k.a. RRM, R


4 it* 1 7 


/1C H 

*fo. / 


3 


339-406 


940 


rrm 


RNA recognition motif. (a.k.a. RRM, R


1 4p-A£ 
1 .*tC-UO 




4 


456-524


941 


CjripleX 


Cysteine rich repeat


9f» AS 


17 o 


1 


60-77 


941 


Bowman-Birk_leg


Bowman-Birk serine protease inhibitor


1 
L 




1


69-84 


941 


laminin_EGF


Laminin EGF-like (Domains III and V) 


0 39 


^ 1 
O. I 


1


Q1 A/1 

oi-y4 


941 


EGF 


EGF-like domain 


O. /c-UO 


ZJ.O 


Z 


yy-izy 


941 


TIL 


Trypsin Inhibitor like cysteine rich 


0.0035 


11.0 


I 


118-139 


941 


EGF 


EGF-like domain 

IJVJI 11IVV VI Will dill 


7.5e-05


20.2


3 


139-173 


941 


TIL 


Trypsin Inhibitor like cysteine rich


n 7A 

V/.ZO 


J.l 


2 


152-179 


941 


toxin 5 


Scorpion short toxin


ft 14 


A A 
4.4 


1 


154-159 


941 


EGF 


EGF-like domain 




71 1 


A 

4 


179-212


941 


EGF 


EGF-like domain 




1A *5 


< 


224-259 


941 


MAM 


MAM domain 




147 A 


i 
1 


vfAl C AH 

4U3-547 


942 


CjripleX 


Cysteine rich repeat 




1 7 8 


I 


05-oz 


942 


Bowman-Birk_leg


Bowman-Birk serine protease inhibitor


1


il A 
4.U 


1 


*7/l CO 


942 


laminin_EGF


Laminin EGF-like (Domains III and V) 


0 32 


f% 1 


1


oo-yy 


942 


EGF 


EGF-like domain 


o. /e— vvl 


91 fi 


9 


1 f\A IIO 


942 


TIL 


Trypsin Inhibitor like cysteine rich 


ft 0035 


1 1 A 


1


191 1/1/1 
1ZJ-144 


942 


EGF 


EGF-like domain 


7.5e-05


20.2




\AA 17C 
144-1 /o 


942 


TIL 


Trypsin Inhibitor like cysteine rich 


0.26 


< 1 


9 
z 


1ST 1 Q/I 


942 


toxin 5 


Scorpion short toxin 


0 34 


44 


1 


1 SO 1 6A 


942 


EGF 


EGF-like domain 


4 4e-ft*5 


91 1 




184-217


942 


EGF 


EGF-like domain 


9.7e-09 


34.3




99Q 9/^/1 
ZZ5/-Z04 


942 


MAM 


MAM domain 


3.5e-41


147.0 


1


*tUO-J JZ 


943 


PHD 


PHD-finger 


3.4e-14 


45.7 


1 


85-128 


943 


bromodomain 


Bromodomain 


5.4e-12


A A A 


1


1i4Q 91< 


943 


PHD 


PHD-finger 


0.61 


3 Q 


9 


9/CA 979 
ZOU-Z /Z 


943 


PWWP 


PWWP domain 


6 3e-10 


36 9 


1 


9^0 119 


943 


GatB 


PET112 family, C terminal region


0 64 


s 1 

J* 1 


1 
i 


9QC ini 


943 


TH1 


TH1 protein 




A 9 


1 


04U-OJJ 


943 


SP2 


Structural protein 2


0 42 


1 1 

1. 1 


1 

L 


Ofl/1 099 

yu4-y zz 


943 


zf-B box 


B-box zinc finger 


0 17 


Q 1 


1 


07/1 QOQ 


943 


zf-MYND 


MYND finger


JJ6- 1 1 


IS 7 


1 
1 


1 A1 1 


944 


PHD 


PHD-finger 


3.4e-14


45.7


1 


85-128


944 


bromodomain 


Bromodomain 


5.4e-12


44.0


1 


149-235


944 


PHD 


PHD-finger 


0.61 


3.9 


2 


260-272 


944 


PWWP 


PWWP domain 


6.3e-10 


36.2 


1 


269-312 


944 


GatB 


PET112 family, C terminal region


0.64 


5.1 


1 


288-303 


944 


TH1 


TH1 protein 


0.91 


0.2 


1 


640-653 


945 


PHD 


PHD-finger


3.4e-14 


45.7 


1 


85-128 


945 


bromodomain 


Bromodomain 


5.4e-12 


44.0 


1 


149-235 


945 


PHD 


PHD-finger 


0.61 


3.9 


2 


260-272 


945 


PWWP 


PWWP domain 


6.3e-10 


36.2 


1 


269-312 


945 


GatB 


PET112 family, C terminal region 


0.64 


5.1 


1 


288-303 


945 


TH1 


TH1 protein


0.91


0.2


1 


640-653 


945 


SP2 


Structural protein 2


0.42


1.1


1 


950-968 


945 


zf-B box 


B-box zinc finger


a i o 


9.1


1 


1020-1035


945 


zf-MYND 


MYND finger


5.3e-11


35.7 


1


1023-1057


946 


PHD 


PHD-finger


3.4e-14


45.7


1


90-133 


946 


bromodomain 


Bromodomain 


5.4e-12 


44.0 


1 


154-240 


946 


PHD 




0.61


3.9 


2 


265-277 


946 


PWWP 


PWWP domain


6.3e-10


36.2 


1 


274-317 


946 


GatB 


PET112 family, C terminal region


0.64


5.1 


1 


293-308 


946 


TH1 


TH1 protein


0.91


0.2


1 


645-658 


946 


SP2 


Structural protein 2


0.42


1.1 


1 


955-973 


946 


zf-B box 


B-box zinc finger


A 1 O 

U.12 


9.1


1 


1025-1040


946 


zf-MYND 


MYND finger 


5.3e-11


35.7 


1 


1028-1062


947 


Urotensin II 


Urotensin II


U.JO 


J A 


1


362-372 


947 


fn2


Fibronectin type II domain


A <C< 


0.5 


1


363-371 


950 


Terminase 5 


Putative ATPase subunit of terminase 


0.87 


0.7 


1 


7-20


950 


ion_trans


Ion transport protein


1 0^ AO 


29.8 


1 


345-518 


950 


SirB 


Invasion gene expression up-regulator 


0.2


6.0 


1 


350-366 


950 


Pept_C1-like


Peptidase C1-like family


0.88 


1.2 


1 


549-569 


950 


BK_channel_a 


Calcium-activated BK potassium channe


5.1e-07 


22.5 


1 


598-702 


950 


zf-CHC2 


CHC2 zinc finger


O./O 


4.9 


1 


739-769 


950 


Alpha_adaptin_C


Alpha adaptin AP2, C-terminal domain


A O 1 

0.31 


3.1 


1 


894-900 


950 


CPSase_L_D3


Carbamoyl-phosphate synthetase large


A TO 

0.12 


1.1 


1 


1086-1098


950 


BK_channel_a


Calcium-activated BK potassium channe


A AOQ 


5.8 


2 


1132- 

11/1 


951 


Pep_M12B_propep


Reprolysin family propeptide


j.zeo / 


1 1 A 5 


1 


OA 1 AO 


951 


Reprolysin 


Reprolysin (M12B) family zinc metallo


1, 1C3-00 


^AA fi 


1 


210-409


951 


Fragilysin 


Fragilysin metallopeptidase (M10C) en






1 


QXO ore 


951 


Peptidase M46 


Pregnancy-associated plasma protein-A


v.V/JO 


J.J 


1


ICC 

34j-3jj 


951 


disintegrin 


Disintegrin 


l- /C-J7 




1


42o-j01 


951 


EGF 


EGF-like domain 




J.*r 


1


^"51 fiCA 

0j1-0j4 


953 


ank 


Ankyrin repeat 




94 0 


1


ici i nr\ 


953 


ank 


Ankyrin repeat 


Op-AO 


^ A 
JJ.U 


A. 


Io3-21j 


953 


ank 


Ankyrin repeat 


V. 1 J 


0.0 


•3 

J 


21o-24o 


953 


ank 


Ankyrin repeat 


0 7p-10 


ic A 


A 


1C(\ 001 
2jU-2o2 


953 


ank 


Ankyrin repeat 


0.00014 


19.5 


5 


283-328 


953 


LolA 


Outer membrane lipoprotein carrier pr


1


1 A 


1 


309-332


953 


ank 


Ankyrin repeat 


3.8e-08 


32.3 


6 


329-361 


953 


ank 


Ankyrin repeat 


0.49 


6.8 


7 


362-394 


954 


interferon 


Interferon alpha/beta domain


7.5e-42 


144.5 




16-105 


955 


ShTK 


ShTK domain 


0.46 


4.9 




67-74 


955 


NADHdh


NADH dehydrogenase 


0.84 


3.4 




123-142 


956 


adh_short


short chain dehydrogenase 


7.6e-27 


92.5 




31-137 


956 


sodcu 


Copper/zinc superoxide dismutase (SOD


0.059 


5.9 




70-87 


956 


Pex14_N


Peroxisomal membrane anchor protein 


0.21 


5.0 


1 


95-105 


956 


CitF 


Citrate lyase, alpha subunit (CitF)


A 00 


1.7


1


124-133


956 


adh_short


short chain dehydrogenase 


6.2e-11


37.7 


2 


138-188 


957


thiored


Thioredoxin 


i 


4.4


1 


69-96 


957


Evr1_Alr


Erv1 / Alr family


D.ie-i / 


iCO c 
OZ.J 




354-441 


957 


GAF 


GAF domain 


0.48 


5.6 


i 


380-401 


957


TFIIS


Transcription factor S-II (TFIIS)


0.14 


7.6 


1 


394-406 


958


acid_phosphat


Histidine acid phosphatase 


o.4e-3o 


125.6 


1 


32-179 






Histidine acid phosphatase 


o.De-z4 


83.0 




205-381 


958 


NicO 


High-affinity nickel-transport protei 


0.99 


2.9 


i 


398-416 




serpin 


Serpin (serine protease inhibitor) 


8.5e-197


663.9 


1 


1-329 


960


serpin 


Serpin (serine protease inhibitor) 


6.8e-87 


295.8 


1 


45-191 


960


serpin 


Serpin (serine protease inhibitor) 


1.7e-116


396.7 




192-397 


961


serpin 


Serpin (serine protease inhibitor)


1.6e-63 


216.1 


1 


45-158 




serpin 


Serpin (serine protease inhibitor) 


5e-139 


472.0 




159-397 


962 


serpin 


Serpin (serine protease inhibitor) 


4.5e-151


512.0 


i 


45-300 


962


Molydop_binding


Molydopterin dinucleotide binding dom




4.1 


1 


289-309 


962 


serpin


Serpin (serine protease inhibitor) 


A Oa, ^A 


1 AA C 

iyo.5 




301-397 


963 


OprB


Carbohydrate-selective porin, OprB fa




O.J 


-j 


16-33 


963 


AJHina<jp f 1 


/\iinid5c } v»-Lci iiuiidi uumdin 


U.OJ 


A 1 
4.1 




A C CO 

45-58 


963 


Adeno_E1A


Early E1A protein


\J.DD 


O A 


-J 

-1 


23 /-z j 1 


964


Pep_M12B_propep


Reprolysin family propeptide


/.4e-4 / 


148.0 




112-220


964 


Reprolysin 


Reprolysin (M12B) family zinc metallo 


1.9e-96 


330.6 


i 


232-426 




Astacin


Astacin (Peptidase family M12A) 


0.21


5.0


-i 


366-380


964


Phi_1


Phosphate- induced protein 1 conserved 


A ^ 1 


3.3


pi 


414-426 


964 


disintegrin


Disintegrin


j.oe-zJ 


oo < 


1 


444-517


964


CBM_10


Cellulose or protein binding domain 


A /1*7 
U.4/ 


o 
0.0 


J 


481-499 


964 


EGF 


EGF-like domain


u./l 


O Q 
/.O 




664-693 


965 


Uteroglobin 


Uteroglobin family


a no 
o.oe-uy 


00 Q 


-y 

-j 


1 OQ 


966 


7tm_2


7 transmembrane receptor (Secretin fa


A OA 


O /C 
2.0 


-J 


19-38


966 




GDA1/CD39 (nucleoside phosphatase) fa


z.ze-y^ 


0 1 < A 

30.4 




48-483


966 


El 


Papillomavirus helicase


A ^6 


A ^ 

*T.J 


-j 


/o-yz 


966 


PLRV_ORF5


Potato leaf roll virus readthrough pr


A 79 


1 A 

1.0 


— 


143-161 


966 


Nicastrin 

illvadUUl 




A 6^ 


l.O 




-J 


1 >1/C 1 01 

146-1 /l 


966 


DUF462 


Protein of unknown function, DUF462


A 


A 7 
4./ 




OOI OAA 

3 / i-3yo 


966 


Adeno_E3B


Adenovirus E3B protein


A 7 


J.O 


~ 

-j 


/IQC CAO 


967 


C1q


C1q domain




100. 1 




OO OAO 


968 


Ornatin


Ornatin


A 


A Q 
4.0 


-J 

-j 


AA 1 A/T 

yy- IU6 


969 


Ornatin


Ornatin


A 


A Q 
4.o 




134-141 


969 


Spo7 


Spo7-like protein


1 


1.5 




405-417 


969 


MARVEL 


Membrane-associating domain 


0.37 


4.5 




487-596 


969 


DUF202 


Domain of unknown function DUF 


0.23 


5.7 




493-518 


970 


ig 


Immunoglobulin domain 


0.0038 


14.6 




41-124 


970 


ig 


Immunoglobulin domain 


0.00023 


19.2 




163-230 


970 


Gag_p30


Gag P30 core shell protein 


3.6e-08 


28.0 




452-491 


970 


zf-CCHC 


Zinc knuckle 


8.8e-07 


27.8 




523-540 


971 


Prefoldin 


Prefoldin subunit 


0.66 


5.0 




179-206 


971 


Seryl_tRNA_N 


Seryl-tRNA synthetase N-terminal 
doma 


0.92 


5.7 




179-196 


971 


Adeno_PIX


Adenovirus hexon-associated protein ( 


0.12 


6.6


1




971 


pentaxin 


Pentaxin family 


2.3e-26 


91.1 


1 


302-464 


971 


Avirulence 


Xanthomonas avirulence protein, Avr/P 


0.07 




1 




972 


ArsA ATPase 


Anion-transporting ATPase 


0 87 


9 & 


1 


jy-Oy 


972 


TSPN 


Thrombospondin N-terminal -like 
domai 


0.88 


9 7 


1 


111 


972 


RHS_repeat 


RHS Repeat 


0.00085


15.6 


9 


zjy-zoo 


972 


RHS repeat 


RHS Repeat 


6.6e-05 


10 5 


4 


i i AJ\fn 


973 


bZIP 


bZIP transcription factor 


0.00024


17.2


i 
i 


69^-fiRfi 
OjiJ-OOO 


973 


integrase_DNA 


DNA binding domain of tn916 
integrase 


0.38 


6.3 


1 




973 


CarD TRCF 


CarD-like/TRCF domain 


0.54 


4.5 


1 


70K-798 

/ UO* / x>0 


974 


WD40 


WD domain, G-beta repeat 


0.05 


9.9 


1 


2-27 


974 


DUF596 


Protein of unknown function, DUF596 


0.84 


3.7 


1 


Uj"/U 


974 


WD40 


WD domain, G-beta repeat 


0.29 


7.2 


3 


76-1 09 
/ tr* i\jy 


974 


denso VP4 


Capsid protein VP4 


0.81 


1.5 




JJJ Jlrr 


974 


TPR 


TPR Domain 


0.1 


9.1 


i 

L 


749-767 


974 


Paramyxo_C 


Paramyxovirus non-structural protein 


0.74 


2.8 


I 


784-SOfl 


974 


Xylose isom 


Xylose isomerase 


0.4 


3.2 


I 


796-81 1 

/ 7v"0 I 1 


974 


TPR 


TPR Domain 


0.083 


9.4 


2 


969-990 


974 


U-box 


U-box domain 


0.036 


6.5 




1994- 

1308 


975 


cofilin ADF 


Cofilin/tropomyosin-type actin-bindin 


0.97 


4.0 


1 


6-18 


975 


Phage_CII


Bacteriophage CII protein 


1 


3.9 


1 


229-243 


975 


ion trans 


Ion transport protein 


0.0048 


11.5 


1 


247-408 


975 


Sarcolipin 


Sarcolipin 


0.56 


5.3 


1 


362-390 


976 


cofilin_ADF 


Cofilin/tropomyosin-type actin-bindin 


0.97 


4.0 


1 


6-18 


976 


Phage_CII


Bacteriophage CII protein 


1 


3.9 


1 


303-317 


976 


ion trans 


Ion transport protein 


0.0048 


11.5 


1 


321-482 


976 


Sarcolipin 


Sarcolipin 


0.56 


5.3 


1 


436-464 


977 


zf-C2H2 


Zinc finger, C2H2 type 


0.083 


11.8 


1 


4-27 


977 


zf-C2H2 


Zinc finger, C2H2 type 


0.00081 


19.9 


2 


108-131 


977 


zf-C2H2 


Zinc finger, C2H2 type 


0.07 


12.1 


3 


162-1 8S 


977 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.45 


3.1 


1 


238-248 


977 


zf-C2H2 


Zinc finger, C2H2 type 


0.28 


9.7 


5 


439-469 


977 


zf-C2H2 


Zinc finger, C2H2 type 


0.0026 


17.9 


7 


600-623 


977 


zf-C2H2 


Zinc finger, C2H2 type 


0.047 


12 8 


Q 

y 




977 


zf-C2H2 


Zinc finger, C2H2 type 


0.66 


8.2 


u 


1030- 

1 UJJ 


977 


zf-C2H2 


Zinc finger, C2H2 type 


0.025 


13.9 


14 


1265- 

19RR 
IZOo 


977 


adeno_fiber


Adenoviral fibre protein (knob domain 


0.076 


3.5 


1 


1349- 

IJJ / 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.023 


14.1 


17 


1470- 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.031 


13.5 


19 


1577- 
1600 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.022 


14.1 


20 


1660- 
1683 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.0044 


16.9 


23 


1892- 
1914 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.41 


9.0 


24 


1968- 
1990 


977 


DC1 


DC1 domain 


0.68 


4.3 


2 


2049- 
Description


E_value


Score 


Repeats 


Position 














2004 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.0039 


17.2 


25 


2051- 

2073 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.0014 


18.9 


26 


2085- 
210/ 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.0094 


15.6 


27 


2114- 
213/ 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.041 


13.0 


28 


2143- 

2100 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.033 


13.4 


30 


2280- 
2303 


977 


TFIID-31 


Transcription initiation factor IID, 


0.28 


5.7 


1 


2300- 
2310 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.14 


10.9 


31 


2314- 

2330 


977 


zf-C2H2 


16/34 1369 1392.. 1 24 


0.0018


18.6 


32 


2360- 

z3oZ 


977


zf-C2H2


16/34 1369 1392.. 1 24


O Ol A 


1/1 7 


33 


Z3oo- 
941 1 


977


Histone_HNS


H-NS histone family


n 


A 7 

Hi 1 


1 
1 


9414 


977


zf-C2H2 


16/34 1369 1392.. 1 24 


3.6e-05 


25.4 


34 


2474- 
94Q6 


977 


PdxA 


Pyridoxal phosphate biosynthetic prot 


0.41 


4.2 


1 


2540- 


QSft 

you 


IGFBP


Insulin-like growth factor binding pr


o ni7 


1ft ft 


i 
i 




yo\) 


kazal 


Kazal-type serine protease inhibitor


0 1^-07 

. JC'u / 


90 4 


i 
i 


71-1 17 


yo\j 


trypsin


Trypsin


4 9<»-94 


74 *\ 


i 
i 


167.196 


Q8ft 


r 


l^aivl UOIIld.111 




7 4 


i 
i 


1 86-20° 


QSft 


UUr / / 1 


Domain of unknown function

II*.} 


0 91 


S 9 


i 

i 


^07-199 


980 


PDZ 


PDZ domain (Also known as DHR or
GLGF


7.1e-14 


50.6 


1 


332-427 


981 


asp 


Eukaryotic aspartyl protease


6.6e-35 


123.8 


1 


19-112 


981 


trans reg C 


Transcriptional regulatory protein, C


0.019 


11.1 


1 


27-55 


981 


asp 


Eukaryotic aspartyl protease


1.8e-23 


83.1 


2 


165-239 


981 


asp 


Eukaryotic aspartyl protease


0.0003 


14.7 


3 


240-268 


981 


asp 


Eukaryotic aspartyl protease


1.7e-48 


171.3 


4 


295-421 


984 


Zn_carbOpept


Zinc carboxypeptidase


1.2e-76 


259.4 


1 


48-249 


984 


APC basic 


APC basic domain 


0.53 


2.7 


1 


279-292 


985 


Zn_carbOpept


Zinc carboxypeptidase


1.2e-76 


259.4 


1 


48-249 


985 


APC basic 


APC basic domain 


0.53 


2.7 


1 


279-292 


986 


NifU N 


NifU-like N terminal domain


1.7e-80 


277.6 


1 


34-160 


987 


SNF7 


SNF7 


6.6e-65 


225.8 


1 


108-277 


987 


Glyco_tran_28_C


Glycosyl transferase family 28 C-termi


0.71 


3.8 


1 


171-201 


988 


Rz1


Lipoprotein Rz1 precursor


0.92 


4.2 


1 


1-35 


988 


UPAR LY6 


u-PAR/Ly-6 domain 


6.4e-06 


29.8 


1 


28-110 


990 


zf-C2H2 


Zinc finger, C2H2 type 


0.00035 


21.4 


1 


53-78 


990 


zf-C2H2 


Zinc finger, C2H2 type 


0.012 


15.2 


2 


87-114 


990 


zf-C2H2 


Zinc finger, C2H2 type 


0.0039 


17.1 


3 


120-144 


991 


pkinase 


Protein kinase domain 


3.2e-90 


309.9 


1 


20-312 


991 


Glyco hydro_15 


Glycosyl hydrolases family 15 


0.18 


4.4 


1 


472-522 


992 


Prefoldin 


Prefoldin subunit 


0.12 


7.6 


1 


5-44 


992 


spectrin 


Spectrin repeat 


0.00067 


15.0 


1 


59-121 
E_value


Score


Repeats


Position 


992 


spectrin 


Spectrin repeat


5.4e-06 


22.2 


2 


124-226 


992 


DUF16 


Protein of unknown function DUF16


0.67 


3.9 




202-250 


992 


spectrin 


Spectrin repeat 


5.2e-07 


25.7 




229-340 


992 


GSPII_E_N


GSPII_E N-terminal domain


0.07 


7.7 




265-290 


992 


spectrin 


Spectrin repeat


2.8e-05 


19.8 




343-449 


992 


TelA 


Toxic anion resistance protein (TelA)


0.75 


3.2 




405-437 


992 


spectrin 


Spectrin repeat 


2e-06 


23.7 




452-538 


992 


spectrin 


Spectrin repeat 


3.1e-13 


47.2 




781-888 


992 


DCP2 


Dcp2, box A domain 


0.57 


4.2 




823-837 


992 


MutS_II


MutS domain II 


0.91


3.5 




840-869 


992


SAA proteins 


Serum amyloid A protein 


0.07


6.0 




866-883 


993 


LysE 


LysE type translocator 


0.02


8.8 




127-147 


994 


Collagen 


Collagen triple helix repeat (20 copi




27 7 




76-118 


994 


Clq 


Clq domain 


8e-32 


115.9 




160-284 


995 


Allantoicase 


Allantoicase repeat 




257.1 




1-136 


995 


Allantoicase 


Allantoicase repeat 


O.UG-JO 


1Q7 5 




159-319 


996 


DNA_ligase_A_
C 


ATP dependent DNA ligase C terminal 
r 


V/.O / 


5 4 




11-34 


996 


ig


Immunoglobulin domain 


0.00019


19.5 




37-151 


996 


ig


Immunoglobulin domain 


V. 1J 


8.7 




182-243 


996 


ig


Immunoglobulin domain 


0.0031 


15.0 




275-335 


996 


SK_channel 



Calcium-activated SK potassium
channe 




7.1 




363-383 


997 


PH 


PH domain


2.4e-24 


81.6 




23-133 


997 


HS2ST 


Heparan sulfate 2-O-sulfotransferase


0.27 


4.4 




140-162 


997 


LMP 


LMP repeated region


0.0012 


14.2 




160-181 


997 


DUF603 


Protein of unknown function, DUF603 


0.04 


6.4 




173-187 


997 


Pox A type inc 


Viral A-type inclusion protein repeat 


0.32 


7.2 




173-187 


997 


IQ 


IQ calmodulin-binding motif 


5e-05 


20.1 




206-226 


997 


RhoGEF 


RhoGEF domain 


1.2e-69


236.9 




247-428 


997 


DUF674 


Protein of unknown function (DUF674)


0.82


1.4 




275-285 


997 


Stig1


Stigma-specific protein, Stig1


n 6 


2.3 




376-421 


997 


PH 


PH domain 




45.3 




460-588 


997 


RasGEFN 


Guanine nucleotide exchange factor fo 


1.1e-19


71.3 




633-688 


997 


RasGEF 


RasGEF domain 


7.2e-89


305.4 




999- 
1184 


997 


Adeno_terminal 


Adenoviral DNA terminal protein


I 


1.7 




1175- 
1207 


998 


DUF630 


Protein of unknown function (DUF630)


0.7 


4.3 




692-705 


998 


FGF 


Fibroblast growth factor


0.37 


4.4 




728-743 


998 


tRNA-synt_2 


tRNA synthetases class II (D, K and N


0.74 


3.5 




754-766


998 


Omega-atracotox


Omega-atracotoxin 


0.15 


5.1 




859-866 


999 


K tetra 


K+ channel tetramerisation domain


2e-34 


121.3 




26-114 


999 


BTB 


BTB/POZ domain 


0.0015 


14.2 




74-125 


1000 


PXA 


PXA domain 


0.01


10.2 




84-104 


1000 


Vps52 


Vps52 / Sac2 family


o 


1099. 
3 




94-601


1000 


trp syntA 


Tryptophan synthase alpha chain 


0.78 


3.1 




173-210 


1000 


DUF965 


Bacterial protein of unknown function 


0.33 


4.5 




285-298 


1000 


Vps53 N 


Vps53-like, N-terminal 


0.93 


2.7 




565-585 


1001 


PHD 


PHD-finger 


3.8e-06 


20.3 




1-24 


1001 


rubredoxin 


Rubredoxin 


0.55 


5.9 




14-28 


1001 


Orbi NS3 


Orbivirus NS3 


0.83 


2.8 




435-458 


1001 


NosL 


NosL 


0.29 


4.9 




1297- 
1321




NAC 


NAC domain 


0.76 


5.5 


1 


1343- 
1365 




DUF240


MG032/MG096/MG288 family 2


0.17 


6.7 


1 


1369- 
1384 


1002 


RecR 


RecR protein 


0.97 


6.3 


! 


104-118 


1002 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger)


2e-09 


26.7 


\ 


108-147 






DC1 domain 


0.045 


7.9 




184-213 


1002 


PHD 


PHD-finger 


6.5e-21 


66.9 


2 


185-233 


1002


zf-MYND


MYND finger 


0.7 


4.3 




186-204 




rubredoxin


Rubredoxin 


0.55 


5.9 


1 


223-237 


1002




Orbivirus NS3 


0.83 


2.8 


1 


644-667 


1002




NosL 


0.29 


4.9 


1 


1506- 
1530 


1002 


NAC 


NAC domain 


0.76 


5.5 




1552- 
1574 


1002 


DUF240 


MG032/MG096/MG288 family 2 


0.17 


6.7 




1578- 
1593 


1003 


Patched 


Patched family 


0.069 


4.7 


4 — 


405-442 


1003


ISAV_HA


Infectious salmon anaemia virus haema 


0.23 


3.2 




716-738 


1003


WD40


WD domain, G-beta repeat 


0.00019 


18.3 


i 


767-802 


1003


WD40


WD domain, G-beta repeat 


0.71 


5.9 


2 


958-992 






WD domain, G-beta repeat 


4.2e-05 


20.6 


3 


1069- 
1104 


1003 


WD40 


WD domain, G-beta repeat 


4.1e-09 


34.6 


4 


1109- 
1145 


1003


WD40


WD domain, G-beta repeat 


0.0012 


15.6 


5 


1150-
1185 


1004 


zz 


Zinc finger, ZZ type 


1e-12


48.2 


1 


3-48 


1004 


SoxD 


Sarcosine oxidase, delta subunit fami


0.97 


4.2 


I 


77-84 


1004 


zf-C2H2 


Zinc finger, C2H2 type 


0.00067 


20.3 


1 


78-101 


1004 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.3 


3.6 


1 


93-113 


1004 


SPDY 


Domain of unknown function 
(DUF317) 


0.6 


4.4 


1 


117-131 


1004 


Di19


Drought induced 19 protein (Di19)


0.00056 


13.0 


1 


312-328 


1006 


C2 


C2 domain 


7e-08 


28.1 


1 


189-259 


1006 


HHE 


Domain of Unknown function 


0.13 


7.5 


1 


216-235 


1006 


C2 


C2 domain 


1.3e-18 


64.3 




304-394 


1007 


RmuC 


RmuC family 


0.79 


3.1 


1 


4-34 


1007 


IBN NT 


Importin-beta N-terminal domain 


2.1e-27 


99.5 


1 


22-101 


1007 


Peripla BP like 


Periplasmic binding proteins and suga 


0.21 


4.7 


1 


130-161 


1008 


Las1


Las1-like


1.6e-94 


320.7 


1 


38-186 


1008 


MuDR 


MuDR family transposase 


0.17 


5.5 


1 


214-246 


1008 


BAR 


BAR domain 


0.21 


5.2 


1 


330-346 


1008 


Adeno E1B 19K 


Adenovirus E1B 19K protein / small t- 


0.43 


4.6 


1 


517-541 


1008 


META 


Domain of unknown function (306) 


0.7 


5.7 


1


615-648 


1009 


PH 


PH domain 


4e-18 


61.1 




UO-227 


1009 


HrpF 


HrpF protein 


0.64 


4.5 




248-257 


1009 


ArfGap 


Putative GTPase activating protein fo 


4.8e-38 


133.0 




250-373 


1009 


ank 


Ankyrin repeat 


3.2e-05 


21.8 




411-446 


1009 


ank 


Ankyrin repeat 


0.00019 


19.0 




447-479 


1009 


DMRL_synthase 


6,7-dimethyl-8-ribityllumazine syntha


0.35 


5.0 




479-494 


1009 


hormone 


Somatotropin hormone family 


0.5 


1.6 




545-561 


1009 


tubulin-binding


Tau and MAP protein, tubulin-binding 


0.11 


8.0 




828-844 
Position


1009 


SH3 


SH3 domain 


j.oe-iz 


4A A 
40.0 


1 
1 


881 -918 


1010 


Bromo CP 


Bromovirus coat protein 


U. 10 


j.j 


1 
1 


1-1? 


1011 


ig 


Immunoglobulin domain 


a fin/i 

U.UU4 


1 A < 
14.0 


1 
1 


77-4S 

Z/-'rJ 


1011 




Immunoglobulin domain 


1 Q« AC 

j.ye-uj 


22. 1 


i 

Z 


Ov IHO 


1011 


ig


Immunoglobulin domain 


j./e-lU 


/1A A 


i 
J 


1 81 74? 

loJ-Z**Z 


1011 


ig


Immunoglobulin domain 


A AA 1 O 

0.00 18 


1 C A 

15.9 


4 


OH1 14? 
Zo 1-JH-Z 


1011 


ig


Immunoglobulin domain 


3.7e-08 


33.4 


5 


379-440 


1011 


DNA_pol_B_2 


DNA polymerase type B, organellar 
and 


0.018 


7.9 


I 


jyo-4jz 


1011 


OapA 


Opacity-associated protein A 


A A A 

0.44 


O A 

2.4 


l 


4A5-477 


1011 


ig


Immunoglobulin domain 


a nmo 
0.0012 


lo.o 


o 


474 

*t /*r-J J J 


1011 


ig


Immunoglobulin domain 


7.7e-07 


c 

25. J 


/ 


COA £14 


1013 


denso VP4 


Capsid protein VP4 


A 11 

0.23 


J.4 


i 
1 


1£^ 18<\ 
100- 10 J 


1015 


efhand 


EF hand 


2.8e-08 


33.9 


1 


29-57 


1015 


COX17 


Cytochrome C oxidase copper 
chaperone 


A A 1 

0.42 


A O 

4.2 


1 
1 


^4 A1 


1015 


efhand


EF hand 


0.0033 


15.3 


2 


65-93 


1015 


efhand


EF hand


o c— f\e 

8.5e-05 


O 1 1 

21.1 


j 


1AO 110 
1UZ-1 j\J 


1015 


PCRF 


PCRF domain 


0.43 


6.1 


1 


129-145 


1015 


DUF21 


Domain of unknown function DUF21 


0.18 


6.4 


1 


1 1A 1 <Q 

1j4-1jo 


1015 


efhand 


EF hand 


5e-09 


36.7 


4 


138-166 


1016 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061 
fam 


3.9e-74 


256.4 


1 


2-2 fy 


1016 


Flavodoxin 2 


Flavodoxin-like fold 


0.66 


3.3 


1 


373-388 


1016 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061 
fam 


1.2e-05 


19.1 


2 


A (\1 AAA 

4UJ-444 


1017 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061
fam 


1e-39


1 A A A 

140.9 


1 


1 1Q OC1 

1 IV-ZJ J 


1017 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061 
fam 


£ O ~ CI 

o.8e-52 


1 OO A 

ioZ.O 


i 
Z 


41 1 J> 1 1 


1017 


Flavodoxin 2 


Flavodoxin-like fold 


U.oo 


i i 

J.J 


i 
1 


70S-770 


1017 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061 
fam 


1 1«. AC 


1 O 1 

iy. i 


i 


71S-776 
/ jj- / /u 


1018 


7tm 1 


7 transmembrane receptor (rhodopsin f 


1 1 a 88 

1. ie-oo 


764 f\ 


1 

L 


87-350 


1018 


DUF395 


YeeE/YedE family (DUF395) 


a 04 


A 7 


1 


188-205 


1019 


LRRNT 


Leucine rich repeat N-terminal domain 


a 1? 

U.1Z 


7 7 


1 
1 


42-56 


1019 


LRR 


Leucine Rich Repeat 


U. 1Z 


q o 
o.z 


1 

1 


82-105 


1019 


LRR 


Leucine Rich Repeat 


a nniQ 


14 1 


3 


133-157 


1019 


LRR 


Leucine Rich Repeat 


a ni i 

U.U 1 J 


1 l.U 


4 


158-181 


1019 


LRR 


Leucine Rich Repeat 


n aaa?i 

U.UUUZJ 


17 S 
i / . j 


5 


182-205 


1019 


LRR 


Leucine Rich Repeat 


a 11 

U.J1 




u 


206-226 


1019 


LRR 


Leucine Rich Repeat 


A 07 


7 4 
/ .** 


8 
o 


251-272 


1019 


LRR 


9/18 273 283.. 1 11 


A AAA^7 


1£ 1 
10.1 


in 


329-352 


1019 


LRR 


9/18 273 283.. 1 11 


A AA/1 

U.UU4 


11 1 
I J.J 


17 

1Z 


0,77-402 


1019 


LRR 


9/18 273 283.. 1 11 


A AA1 1 


1 A 0 


11 
1J 


40^-426 






Q/1R 27^ 283 1 11 


0.27 


7.1 


14 


427-439 


1019 


LRR 


9/18 273 283.. 1 11 


0.16 


7.9 


15 


463-484 


1019 


LRR 


9/18 273 283.. 1 11 


0.8 


5.5 


16 


486-510 


1019 


LRR 


9/18 273 283.. 1 11 


0.035 


10.1 


17 


537-558 


1019 


TIMELESS 


Timeless protein 


0.45 


3.0 


I 


553-568 


1019 


LRR 


9/18 273 283.. 1 11 


0.084 


8.8 


18 


559-582 


1020 


AMP-binding 


AMP-binding enzyme 


4.5e-49 


173.2 


1 


1-177 


1020 


RNA_pol_Rpc4


RNA polymerase III RPC4


0.62 


4.2 


1 


189-199 


1020 


Phage 30 8 


Phage GP30.8 protein 


0.92 


2.6 


1 


233-253 
E_value


Score 


Repeats 


Position 


1021 


SKIP_SNW


SKIP/SNW domain 


0.3 


4.7 


i 


92-113 


1021 


cNMP binding 


Cyclic nucleotide-binding domain 


0.55 


5.2 




102-132 


1021 


cytochromes 


Cytochrome c 


0.92 


3.7 


i 


313-329 


1021 


cNMP_binding


Cyclic nucleotide-binding domain 


1.5e-15 


57.4 




345-435 


1021 


RasGEFN 


Guanine nucleotide exchange factor fo 


0.00023 


17.5 




460-504 


1021 


Pseu avirulence 


Avirulence protein 


0.91 


1.9 




491-504 


1021 


PDZ 


PDZ domain (Also known as DHR or 
GLGF 


5.2e-19


68.7 


1 


580-661 


1021 


RA 


Ras association (RalGDS/AF-6) 
domain 


2.6e-08 


32.5 


1 


806-885 


1021 


RasGEF 


RasGEF domain 


2.7e-48 


170.6 


i 


907- 
1092 


1022 


SKIP SNW 


SKIP/SNW domain 


0.3 


4.7 


i 


42-63 


1022 


cNMP binding 


Cyclic nucleotide-binding domain 


0.55 


5.2 




52-82 


1022 


cytochromes 


Cytochrome c 


0.92 


3.7 


i 


263-279 


1022 


cNMP binding 


Cyclic nucleotide-binding domain 


1.5e-15 


57.4 




295-385 


1022 


RasGEFN 


Guanine nucleotide exchange factor fo 


0.00023 


17.5 




410-454 


1022 


Pseu avirulence 


Avirulence protein 


0.91 


1.9 


i 


441-454 


1022 


PDZ 


PDZ domain (Also known as DHR or 
GLGF 


5.2e-19 


68.7 




530-611 


1022 


RA 


Ras association (RalGDS/AF-6) 
domain 


2.6e-08 


32.5 


1 


756-835 


1022 


RasGEF 


RasGEF domain 


2.7e-48 


170.6 


1 


857- 
1042 


1026 


Ricin_B_lectin


QXW lectin repeat 


0.14 


8.3 


1 


134-161 


1026 


MCR_beta_N 


Methyl-coenzyme M reductase beta 
subu 


0.98 


2.1 


1 


152-160 


1026 


Ricin B lectin 


QXW lectin repeat 


4.5e-07 


28.1 




196-225 


1026 


Ricin B lectin 


QXW lectin repeat 


0.0012 


15.8 




22o-2oD 


1027 


SCF 


Stem cell factor 


2.9e- 
155 


512.2 


1 

— 


1-214 


1027 


FH2 


Formin Homology 2 Domain 


0.027 


o o 

8.8 


— 




1027 


Herpes_UL7


Herpesvirus UL7 like 


0.072 


7.6 


_J 


1 H£ 0 1 ^ 


1028 


cadherin 


Cadherin domain 


3.4e-12 


44.2 




DKhlJ 1 


1028 


cadherin 


Cadherin domain 


1.7e-22 


OA 1 

oU. 1 


2 




1028 


cadherin 


Cadherin domain 


6e-20 


71.3 


D 




1028 


cadherin 


Cadherin domain 


5.9e-21 


74.8 


4 


379-452 


1028 


cadherin 


Cadherin domain 


0.0035 


12.8 


c 

5 


52100/ 


1029 


cadherin 


Cadherin domain 


3.4e-12 


44.2 






1029 


cadherin 


Cadherin domain 


1.7e-22 


80.1 




155-250 


1029 


cadherin 


Cadherin domain 


6e-20 


71.3 


3 


264-342 


1029 


cadherin 


Cadherin domain 


1.8e-22 


80.0 




<-> mf\ A '~tf\ 

379-470 


1029 


cadherin 


Cadherin domain 


0.0035 


12.8 


5 


483-529 


1030 


Troponin 


Troponin 


0.87 


3.1 


1 


21-117 


1030 


Mycoplasma_M 
AA2 


Mycoplasma arthritidis MAA2 repeat 


0.65 


3.7 


1 


518-527 


1030 


PH 


PH domain 


6.5e-14 


47.1 




522-624 


1030 


DUF1041 


Domain of Unknown Function 
(DUF1041) 


3.4e-79 


273.2 




738-950 


1030 


Allene_ox_cyc


Allene oxide cyclase 


0.7 


2.8 




817-852 


1031 


Renal_dipeptase 


Renal dipeptidase 


1.9e- 
108 


370.4 




74-354 


1031 


Amidase 3 


N-acetylmuramoyl-L-alanine amidase 


0.76 


3.8 




222-234 


1032 


Trp_Tyr_perm


Tryptophan/tyrosine permease family 


0.0026 


10.3 




42-63 
1032


aa_permeases


Amino acid permease


8.4e-32 


115.8 


1 


48-371 


1 AT) 

1032 


Pox_I5


Poxvirus protein I5


0.24 


6.0 




162-179 


1032 


serine_carbpept


Serine carboxypeptidase


0.41 


2.3 


1 


378-398 


1033 


THF_DHG_CYH


Tetrahydrofolate dehydrogenase/cycloh


0.027 


6.2 


I 


89-108 


1033 


THF_DHG_CYH


Tetrahydrofolate dehydrogenase/cycloh


6.1e-13 


37.3 




119-180 


1033 


THF_DHG_CYH 


Tetrahydrofolate dehydrogenase/cycloh 


6.5e-07 


25.7 


1 


182-229 


1033 


FTHFS


Formate--tetrahydrofolate ligase


0 


1365. 
1 


1 


360-979 


1U34 


acid_phosphat


Histidine acid phosphatase 


0.038 


6.9 


1 


378-394 


10^4 
1U34 


FMN_red


NADPH-dependent FMN reductase 


0.94 


3.3 


1 


425-446 


IU34 


acid_phosphat


Histidine acid phosphatase


0.02 


7.9 




512-581 


1 A^A 


Ribosomal_L6


Ribosomal protein L6 


0.21 


7.2 


1 


760-800 


i a^ 


PDZ


PDZ domain (Also known as DHR or 
GLGF 


3.2e-14


51.8 


1 


47-111 






Protein of unknown function DUF62 


1 


2.7 


-j 


71-91 


1035 


AraC binding 


AraC-like ligand binding domain 


0.99 


3.9 


-j 


139-198 


1 A75 
1U3j 


Armadillo_seg


Armadillo/beta-catenin-like repeat 


0.97 


5.6 




170-187 


1 A1< 


upv >J<JA?i 


Hepatitis C virus non-structural prot


0.057 


8.8 


1 


319-348 


1 A1^ 


RasGAP


GTPase-activator protein for Ras-like 


0.37 


3.6 


1 


764-783 


1 A15 

103j 




RhoGEF domain


1.7e-28 


97.2 


1 


778-962 


1 

IKJJJ 


QT47 


STT2 domain 


0.98 


3.2 


1 


819-829 


1 

1U3D 


PH


PH domain


4.2e-05 


18.0 


1 


1006- 
1119 




ocir in 


Selenoprotein P, N terminal region 


0.25 


3.7 


1 


1112- 
1138 


1037 


PH 


PH domain 


6.7e-14 


47.1 


4 


17-124 


1 A^7 




EF hand 


9.2e-05 


21.0 




138-166 


10^7 




EF hand 


0.0023


15.8 




174-202 


10^7 


PI-PLC-X


Phosphatidylinositol-specific phospho 


5.9e-17 


60.5 


1 


291-326 


IVJJO 


DUF765


Circovirus protein of unknown functio 


0.85 


3.7 


1 


274-302 




ABG_transport


AbgT putative transporter family 


0.81 


1.2 


1 


13-26 




7tm 1 


7 transmembrane receptor (rhodopsin f 


7.4e-29 


85.1 


1 


40-289 


io^q 

1VJ J7 


HECT


HECT-domain (ubiquitin-transferase) 


0.15 


5.5 


1 


273-290 


1040 


TSPN 


Thrombospondin N-terminal -like 
domai 


1.4e-41 


136.6 


1 


1-101 


1040 


TIL 


Trypsin Inhibitor like cysteine rich 


0.66 


3.9 


1 


195-239 


1040 


EGF 


EGF-like domain 


0.0046 


13.8 


1 


199-233 


1040 


Baculo LEF-3 


Nucleopolyhedrovirus late expression 


0.0024 


10.4 


1 


230-244 


1040 


EGF 


EGF-like domain 


0.51 


6.4 




239-269 


1040 


Mu-conotoxin 


Mu-Conotoxin 


0.63 


5.2 


1 


283-304 


1040 


dickkopf_N


Dickkopf N-terminal cysteine-rich reg 


0.94 


3.5 


1 


292-299 


1040 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.45 


5.6 


1 


311-327 


1040 


EGF


EGF-like domain 


4.3e-05 


21.1 


4 


333-366


1040 




Thrombospondin type 3 repeat 


0.00027 


16.9 


1 


405-417 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.032 


10.2 


2 


418-433 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.0046 


13.0 


3 


441-453 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.00087 


15.3 


4 


464-476 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.023 


10.7 


5 


477-492 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.00058 


15.9 


6 


500-512 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.0033 


13.4 


7 


523-535 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.0011 


15.0 


8 


538-553 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.00057 


15.9 


9 


561-573 


1040 


tsp 3 


Thrombospondin type 3 repeat 


0.0015 


14.6 


11 


601-613 
Score


Repeats 


Position 


1U4U 




Thrombospondin type 3 repeat


0.03 


10.3 


12 


614-629 


1 (\A A 
IU4U 




Thrombospondin C-terminal region


7.1e-
176 


594.4 


I 


654-854 


1040 


Mnd1


Mnd1 family


0.68 


3.4




853-861 


IU4Z 


0000 U 


Copper/zinc superoxide dismutase
(SOD 


I 


2.0 


1 


31-44 


104? 


LJCLVJU 


Dihydrodipicolinate reductase, C-term 


0.84 


4.5 


1 


38-52 




PTR2


POT family 


3e-55 


193.1 


1 


103-335 


1U4Z 


PTR2


POT family 


1.5e-36 


127.7 




336-471 


104? 


Adeno PIX


Adenovirus hexon-associated protein ( 


0.76 


3.8 


1 


493-508 




Drf GBD 


Diaphanous GTPase-binding Domain 


1.7e-60 


211.1 


1 


40-229 


1043 


DUF1000 


Domain of Unknown Function 
(DUF1000)


0.79 


3.3 


1 


141-157 




Drf_FH3


Diaphanous FH3 Domain


7.2e-71 


244.6 


1 


231-437 




S-antigen


S-antigen protein


0.24 


3.2 


1 


429-436 




bZIP


bZIP transcription factor 


0.88 


4.7 


1 


440-478 


1041 


eRF1_1


eRF1 domain 1


0.55 


4.9 


1 


480-502 


1041 


CHASE3


CHASE3 domain 


0.025 


9.5 


1 


489-517 


1041 


Pox_A_type_inc


1/4 449 468 .. 4 23 


0.14 


8.3 




495-517 


1041 


FH2


Formin Homology 2 Domain 


6.2e- 
154 


521.5 


1 


596-969 


1041 


DUF387 


Putative transcriptional regulators ( 


0.0045 


10.2 


1 


777-800 


1041 


EMP24 GP25L 


emp24/gp25L/p24 family 


0.18 


5.7 


1 


868-899 


1041 


IE68 


Herpesvirus immediate early protein 


0.25 


6.4 


1 


882-920 


1044 


KRAB 


KRAB box 


2.9e-27 


100.7 


1 


8-48 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


0.00026 


21.9 


1 


114-136 


1044 


XPA N 


XPA protein N-terminal 


0.51 


5.7 


1 


139-151 


1044 


TFIIS 


Transcription factor S-II (TFIIS) 


0.16 


7.4 


2 


142-152 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


6.7e-06 


28.3 


2 


142-164 


1044 


XPA N 


XPA protein N-terminal 


0.49 


5.8 


2 


167-179 


1044 


TFIIS 


Transcription factor S-II (TFIIS) 


0.18 


7.2 


3 


170-180 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


4e-06 


29.2 


3 


170-192 


1044 


TFIIS 


Transcription factor S-II (TFIIS) 


0.5 


5.7 


4 


198-208 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


6.3e-05 


24.4 


4 


198-220 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


7.8e-07


32.1 


5 


226-248 


1044 


XPA N 


4/13 223 235.. 1 13 


0.45 


5.9 


5 


251-263 


1044 


eIF5 eIF2B 


Domain found in IF2B/IF5 


0.95 


3.5 


1 


254-264 


1044 


TFIIS 


Transcription factor S-II (TFIIS) 


0.069 


8.7 


6 


254-264 


1044 


Transposase 12 


Transposase 


0.11 


5.8 


1 


254-280 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


5.9e-07 


32.6 


6 


254-276 


1044 


zf-BED 


BED zinc finger 


0.14 


6.9 


1 


255-277 


1044 


XPA N 


4/13 223 235.. 1 13 


0.15 


7.6 


6 


279-291 


1044 


TFIIS 


Transcription factor S-II (TFIIS) 


0.15 


7.5 


7 


282-292 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


8.3e-07 


32.0 


7 


282-304 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


4.6e-06 


29.0 


8 


310-332 


1044 


zf-C2H2 


Zinc finger, C2H2 type


5.3e-06 


28.7 


9 


338-360 


1044 


zf-C2H2 


Zinc finger, C2H2 type


4e-06 


29.2 


10 


366-388 


1044 


XPA N 


4/13 223 235.. 1 13 


0.84 


5.0 


8 


391-403 


1044 


TFIIS 


10/18 368 376.. 31 39 


0.25 


6.8 


11 


394-404 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-06 


30.3 


11 


394-416 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


7.5e-06 


28.1 


12 


422-444 


1044 


Evr1_Alr


Erv1 / Alr family


0.48 


5.4 


1 


442-460 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


9.9e-07 


31.7 


13 


450-472 


1044 


PqiA 


Paraquat-inducible protein A


0.025 


8.9 


2 


469-500 
E_value
Repeats


Position 


1044 


XPA N 


4/13 223 235.. 1 13 


0.45 


5.9 


9 


475-487 


1044 


eIF5 eIF2B 


Domain found in IF2B/IF5 


0.95 


3.5 


2 


478-488 


1044 


TFIIS 


12/18 424 432.. 31 39


0.069 


8.7 


14 


478-488 


1044 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-06 


30.2 


14 


478-500 


I ftAA 




Zinc finger, C2H2 type 


6.2e-05 


24.4 


15 


506-528 


1044 


zf-C2H2


Zinc finger, C2H2 type 


1.1e-07


35.5 


16 


534-556 






12/18 424 432.. 31 39 


0.78 


5.1 


15 


536-544 


1 (\AA 




12/18 424 432.. 31 39 


0.12 


7.9 


16 


562-572 


lU*t*r 


zf-C2H2


Zinc finger, C2H2 type 


2.5e-06 


30.0 


17 


562-584 


1044 


TFIIS


12/18 424 432.. 31 39 


0.25 


6.8 


17 


590-600 


1 (\AA 


zf-C2H2


Zinc finger, C2H2 type 


6.2e-08 


36.5 


18 


590-612 




Umbravirus_LDM


Umbravirus long distance movement 
(LD 


0.56 


2.8 


1 


601-626 


1044


TFIIS 


12/18 424 432.. 31 39 


0.062 


8.8 


18 


618-628 


1044 


zf-C2H2


Zinc finger, C2H2 type


3.6e-07 


33.4 


19 


618-640 


1044 


zf-BED 


3/7 423 445.. 24 52 


0.027 


9.3 


7 


619-641 


1045 


Sprouty


Sprouty protein (Spry) 


1.2e-17 


55.0 


1 


33-70 


104S 




Sprouty protein (Spry) 


2.7e-10 


31.5 


2 


73-90 


1 046 


HAMP


HAMP domain 


0.21 


7.3 


1 


9-42 


1046 


PA 


PA domain 


3.6e-19 


65.4 


1 


155-255 


1046 


Peptidase_M28


Peptidase family M28 


1.4e- 
120 


410.8 


1 


332-585 


1046 


Rorrelia Hoo 


Borrelia burgdorferi virulent strain 


0.98 


2.5 


1 


591-604 


1046 


TFR dimer 


Transferrin receptor-like dimerisatio 


1e-65


228.5 


1 


597-739 


1047 


GvpG 


Gas vesicle protein G 


0.088 


6.7 


1 


17-49 


1048 


Sema 


Sema domain 


3.2e-08 


29.3 


1 


34-127 


1048 


ABM 


Antibiotic biosynthesis monooxygenase 


0.74 


5.7 


1 


192-208 


1048


Sema


Sema domain 


2.3e-06 


22.6 


2 


386-449 


1048 


PSI 


Plexin repeat 


2.3e-20 


65.3 


1 


468-519 


1048




Plexin repeat


1.4e-12 


41.0 


2 


759-801 


1048 


TIG 


IPT/TIG domain 


1.6e-20 


78.3 


1 


803-893 


1 f\AQ 




IPT/TIG domain


4.5e-19 


73.5 


2 


895-980 


1048 


TIG 


IPT/TIG domain 


3.3e-13 


51.7 


3 


983-1092




Competence


Competence protein


0.77 


3.3 


1 


1181-1224


1048 


RNB 


RNB-like protein 


0.064 


6.4 


1 


1389-1412


1048


Fimbrial_K88


Fimbrial, major and minor subunit 


0.15 


5.4 


1 


1461-1470


1048


ubiquitin


Ubiquitin family


0.021 


10.4 


-1 


1463-1497


1049


BTB


BTB/POZ domain


4.5e-28 


102.9 


1 


20-124 


1050


ABC_tran


ABC transporter


8.3e-40 


134.3 


1 


26-217 


1050




Domain of Unknown Function (DUF908)


0.55 


3.1 


1 


69-83 


1050 


RhoGAP 


RhoGAP domain 


0.058 


7.1 




69-82 


1050 


Chlamydia_PMP 


Chlamydia polymorphic membrane protei


0.63 


2.9 




546-565 


1051 


ZZ 


Zinc finger, ZZ type 


1e-12


48.2 




3-48 


1051 


SoxD 


Sarcosine oxidase, delta subunit fami 


0.97 


4.2 




77-84 


1051 


zf-C2H2 


Zinc finger, C2H2 type


0.00067 


20.3 




78-101 


1051 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.3 


3.6 




93-113 


1051 


SPDY 


Domain of unknown function (DUF317)


0.6


4.4




117-131


1051 


Di19


Drought induced 19 protein (Di19)


o ooos^ 


n o 

i J.U 


-j 


312-328


1052 


ig 


Immunoglobulin domain 


7*»-19 

/C"l_« 


47 4 
*+/,*+ 


T " 


34-110 


1052 


MyTH4 


MyTH4 domain 


o izt 


J.O 


1 




141-152 


1052 


ig


Immunoglobulin domain 


n oo?i 


i c 

i j.j 


— 


150-204


1053 


kringle 


Kringle domain 


o oil 


R 7 
o. / 






1053 


WSC 


WSC domain 


2.3e-06 


25.6 


i 


89-142 


1053 


CUB 


CUB domain 


1 /la 1 < 


<A C 

jZ.J 




\Cf. *)£.C\ 


1054 


ig 


Immunoglobulin domain 


1 To o^ 
l . /e-uj 


Z3.4 


1 


JO- 1 1 J 


1054 


Phage cap E 


Phage major capsid protein E 


0.79 


2.8 


i 


127-136 


1055 


MHC I 


Class I Histocompatibility antigen, d 


le-14o 


/I AT 

497. o 


J 


AC AAT 

Zj-ZUJ 


1055 


DUF497 


Protein of unknown function (DUF497) 


0.2 


6.7 


-J 


43- JO 


1055 


ig 


Immunoglobulin domain 


5.4e-09


36.5




AAA AQC 

zzU-Zoj 


1055 


DUF395 


YeeE/YedE family (DUF395) 


0.18 


7.2 


i 


310-335 


1056 


LBP_BPI_CETP_C


LBP / BPI / CETP family, C-terminal d 


5.8e-05 


18.1




C< 1 TO 

jo-i3y 


1057 


LBP BPI CETP 


LBP / BPI / CETP family, N-terminal d 


5.1e-38 


130.2 


i 


22-185 


1057 


LBP_BPI_CETP_C


LBP / BPI / CETP family, C-terminal d 


1 £. — 1 A 

l.oe-lz 


45.3




AQ1 A1Q 

zy i-4zy 


1057 


Peptidase M20 


Peptidase family M20/M25/M40 


0.3


4.0


-J 


-1 OA /II A 


1058 


LBP BPI CETP 


LBP / BPI / CETP family, N-terminal d 


5.1e-38


130.2


-j 


22-185


1058 


LBP_BPI_CETP_C


LBP / BPI / CETP family, C-terminal d 


5.8e-05


18.1


1 

— 


zy i - J /4 


1059 


LBP BPI CETP 


LBP / BPI / CETP family, N-terminal d


1.1e-41


141.1




10-901 


1059 


LBP_BPI_CETP_C


LBP / BPI / CETP family, C-terminal d 


5.8e-05 


18.1 


i 


307-390 


1060 


Secretogranin_V


Neuroendocrine protein 7B2 precursor 


2.2e-134


456.6 


i 


1-204


1060 


Ribosomal L19e 


Ribosomal protein L19e 


0.7 


3.6 


i 


167-193 


1062 


PMP22 Claudin 


PMP-22/EMP/MP20/Claudin family


6.9e-4o 


1 CA "2 


-J 


A 181 


1062 


Acyl transf 3 


Acyltransferase family 


0.12


6.3 




1 0A 1 c 1 
1UO-1 J 1 


1063 


Ribosomal L29e 


Ribosomal L29e protein family 


0.0025 


12.9 


i 


21-49 


1064 


PDZ 


PDZ domain (Also known as DHR or GLGF


7.6e-11


40.0




l-o4 


1064 


PDZ 


PDZ domain (Also known as DHR or GLGF


4.2e-10


37.4 


2 


209-297 


1064 


PDZ 


PDZ domain (Also known as DHR or GLGF


A / -. 1 /C 

2.4e-lo 


jy.3 


a 
J 


110 101 


1064 


CBM 11 


Carbohydrate binding domain (family 1 


O.lo 


C 1 

j.i 


1


160-178 


1064 


PDZ 


PDZ domain (Also known as DHR or GLGF


n lex 1 o 


OO. 1 


*r 


409-490 


1064 


DUF390 


Protein of unknown function (DUF390)


o so 

U.oZ 


n 7 

u. / 


1


534-555 


1064 


PDZ 


PDZ domain (Also known as DHR or GLGF


z.oe-uy 


1A ft 

J*T.O 


J 


694-775 


1065 


PID 


Phosphotyrosine interaction domain (P 


j.je-**/ 


lOu. J 


1


42-168 


1066


Galactosyl_T


Galactosyltransferase


0.17 


5.8 


1 


106-116 


1066 


Chorismate synt 


Chorismate synthase 


0.8 


1.6 


1 


291-298 


1067 


pkinase 


Protein kinase domain 


1e-73


255.1 


1 


12-272 


1068 


lipocalin 


Lipocalin / cytosolic fatty-acid bind 


7e-38 


136.0 


1 


38-186 


1068 


Triabin 


Triabin 


0.0018 


12.1 


1 


119-136 


1069 


lactamase B 


Metallo-beta-lactamase superfamily 


1.7e-21 


80.0 


1 


11-172 


1070 


annexin 


Annexin 


9.9e-30 


107.8 


1 


58-124 


1070 


annexin 


Annexin 


6.3e-33 


119.1 


2 


130-196 


1070 


annexin 


Annexin 


9.7e-28 


100.7 


3 


213-280 
1070


annexin 


Annexin 


4.3e-33 


119.8 


4 


289-355 


1071 


SNF 


Sodium:neurotransmitter symporter fam


0 


1200.7


1 


44-574 


1071 


ATP-sulfurylase 


ATP-sulfurylase


0.28 


3.8 


1 


198-220 


1071 


DUF900 


Protein of unknown function (DUF900) 


0.98 


2.8 


1 


408-420 


1072 


ig 


Immunoglobulin domain 


5.7e-06 


25.2 


1 


38-122 


1073 


Glypican 


Glypican 


3.9e-292


979.9 




3-566 


1074 


zf-C2H2 


Zinc finger, C2H2 type 


1 


7.4 




16-40 


1074 


rrm


RNA recognition motif. (a.k.a. RRM, R 


9.2e-10 


36.8 




58-123 


1074 


PAP assoc 


PAP/25A associated domain 


1.6e-14 


51.8 


i 


490-549 


1074 


Isochorismatase 


Isochorismatase family 


0.49 


4.1 


i 


700-736 


1075 


PgpA 


Phosphatidylglycerophosphatase A 


0.92 


3.0 


1 


12-27 


1075 


SLT 


Transglycosylase SLT domain 


0.23 


6.2 


-\ 


82-112 


1075 


cNMP_binding 


Cyclic nucleotide-binding domain 


0.67 


4.9 




173-196 


1075 


Glyco_transf_29


Glycosyltransferase family 29 (sialyl 


1.6e-22


77.6 


~-j 


289-506 


1076 


Sec23 trunk 


Sec23/Sec24 trunk domain 


0.47 


4.0 




42-53


1076 


Hydrolase 


haloacid dehalogenase-like hydrolase 


0.77 


3.7 




46-76 


1078 


A2M_N 


Alpha-2-macroglobulin family N-termin


4.5e-91 


312.7 


! 


6-613 


1078 


Big_1


Bacterial Ig-like domain (group 1) 


0.62 


3.9 




382-403 


1078 


A2M 


Alpha-2-macroglobulin family 


6.2e-64 


214.2 


I 


721-949 


1078 


A2M 


Alpha-2-macroglobulin family 


6.2e-132


444.2 




983-1469


1078 


Pox_D2 


Pox virus D2 protein 


0.18 


3.4 


I 


1446-1461


1079 


A2M_N 


Alpha-2-macroglobulin family N-termin


1.5e-92 


317.7 


1 


19-626 


1079 


Big_1


Bacterial Ig-like domain (group 1)


0.62 


3.9 




395-416 


1079 


A2M 


Alpha-2-macroglobulin family 


3e-44 


147.6 




735-836 


1079 


kringle 


Kringle domain 


0.077 


7.4 


1 


840-859 


1080 


A2M_N 


Alpha-2-macroglobulin family N-termin


2e-65 


227.5 


1 


6-548 


1080 


Big_1


Bacterial Ig-like domain (group 1) 


0.62 


3.9 




382-403 


1081 


A2M_N 


Alpha-2-macroglobulin family N-termin


4.5e-91 


312.7 


1 


6-613 


1081 


Big_1


Bacterial Ig-like domain (group 1) 


0.62 


3.9 




382-403 


1081 


A2M 


Alpha-2-macroglobulin family 


6.2e-64 


214.2 


1 


721-949 


1081 


A2M 


Alpha-2-macroglobulin family 


3.2e-137


462.1 




983-1469


1081 


Pox_D2 


Pox virus D2 protein 


0.18 


3.4 




1446-1461


1082 


A2M_N 


Alpha-2-macroglobulin family N-termin


1.5e-92 


317.7 


1 


6-613 


1082 


Big_1


Bacterial Ig-like domain (group 1) 


0.62 


3.9 


1 


382-403 


1082 


A2M 


Alpha-2-macroglobulin family 


3e-44 


147.6 




722-823 


1082 


kringle 


Kringle domain 


0.077 


7.4 




827-846 


1083 


COesterase 


Carboxylesterase


1.4e-192


649.9 




8-547 


1083 


A2M_N 


Alpha-2-macroglobulin family N-termin


0.83 


2.3 




12-28 


1084 


EGF 


EGF-like domain 


2.8e-05 


21.8 




192-219 


1084 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.37 


5.9 




208-220 


1084 


Tautomerase 


Tautomerase enzyme 


0.14 


5.5 




292-318 
1084


EGF 


EGF-like domain 


9.9e-07 


27.0 


2 


404-431 


1084


Arthro defensin 


1/3 199 211 23 36 


0.52 


3.0 


2 


411-423 


1084 


TB 


TB domain 


1.3e-16


54.5 


1 


567-610 


1084


EGF 


EGF-like domain 


1.1e-07


30.4 


3 


631-666 


1084 


MoeZ MoeB 


MoeZ/MoeB domain 


0.62 


3.6 


I 


676-682 


1084


TB 


TB domain 


2.9e-26 


85.2 


2 


688-729 


1084 


EGF 


EGF-like domain 


1.9e-07 


29.6 


4 


878-914 


1084


TIL


Trypsin Inhibitor like cysteine rich 


0.00021 


14.8 


2 


898-920 


1084 


CBM_14 


Chitin binding Peritrophin-A domain 


0.88 


3.8 


1 


901-920 






EGF-like domain


1.2e-07 


30.3 


5 


920-956 


1084 


TIL 


Trypsin Inhibitor like cysteine rich 


0.00057 


13.5 


3 


941-962 


1084 


squash 


Squash family serine protease inhibit


0.069 


4.9 


1 


942-969 


1084 


granulin 


Granulin 


0.06 


7.4 


1 


943-958 


1084 


EGF 


EGF-like domain


0.098


9.0 


6 


962-983 


1084 


VSP 


Giardia variant-specific surface prot 


0.031 


7.4


I 


982-1003


1084 


EGF 


EGF-like domain 


5.2e-06 


24.4 


7 


1043-1078


1084 


EGF 


EGF-like domain 


1.6e-06 


26.2 


8 


1084-1119


1084 


TIL 


5/14 1063 1084.. 47 68 


0.98 


3.3


6 


1103-1125


1084 


VSP 


Giardia variant-specific surface prot 


0.29 


3.9 


2 


1105-1126


1084 


EGF 


EGF-like domain 


2.1e-06 


25.9 


9 


1125-1160


1084 


TIL 


5/14 1063 1084.. 47 68 


0.0095 


9.6 


7 


1145-1166


1084 


EGF 


EGF-like domain 


5.9e-05 


20.6 


10 


1166-1201


1084 


Plasmod_Pvs28 


Plasmodium ookinete surface protein P 


0.11 


6.2 


1 


1172-1210


1084 


TIL 


5/14 1063 1084 .. 47 68 


0.0044 


10.7 


8 


1187-1207


1084 


EGF 


EGF-like domain 


5.5e-06 


24.3 


11 


1207-1243


1084 


PAD_porph 


Porphyromonas-type peptidyl-arginine 


0.047 


8.4 


1 


1224-1234


1084 


Plasmod_Pvs28


Plasmodium ookinete surface protein P 


0.57 


3.6 


2 


1226-1281


1084 


TIL 


5/14 1063 1084.. 47 68 


0.016 


8.9 


9 


1228-1249


1084 


VSP 


Giardia variant-specific surface prot 


0.043 


6.9 


3 


1229-1249


1084 


EGF 


EGF-like domain 


4.9e-06 


24.5 


12 


1249-1285






EGF-like domain 


1.1e-05


23.3 


13 


1291-1328


1084 


TB 


TB domain 


8.7e-23 


74.2 


3 


1358-1401


1084 


EGF 


EGF-like domain 


6e-05 


20.6 


14 


1429-1466


1084 


CBM_14 


Chitin binding Peritrophin-A domain 


0.3 


5.3 


2 


1452-1472


1084 


BB 


2/4 962 979.. 1 18 


0.79 


4.0 


3 


1472-1483


1084 


EGF 


EGF-like domain 


1.3e-07 


30.3 


15 


1472-1507


1084 


TB 


TB domain 


1.8e-23 


76.3


4 


1535-1577


1084 


EGF 


EGF-like domain 


3.3e-06 


25.1 


16 


1626-1661


1084 


Plasmod_Pvs28 


Plasmodium ookinete surface protein P 


0.44 


4.0 


3 


1632-1708


1084 


TIL


5/14 1063 1084 .. 47 68


0.00068


11 ? 


n 


1642-1667


1084


VSP


Giardia variant-specific surface prot


0.96 


2.0 


4 


1646-1667


1084


EGF


EGF-like domain


1.2e-07 


30.3 


17 


1667-1706


1084


TIL


5/14 1063 1084 .. 47 68


0.85 


3.5 


14 


1690-1706


1086 




Immunoglobulin domain 


5.3e-07 


29.1 


1 


168-232 


1086


v^orond in o*t 


Coronavirus non-structural protein NS


0.47 


3.5 


1 


248-271 


1086 


ig 


Immunoglobulin domain 


0.052 


10.4 


2 


285-347 


1086


fn3


Fibronectin type III domain


2.4e-16 


58.5 


1 


373-459 


1086 


fn3


Fibronectin type III domain 


3.1e-15 


54.7 


2 


501-587 


1086


fn3


Fibronectin type III domain


5.5e-19 


67.7 


3 


602-685 


1086 


fn3


Fibronectin type III domain 


le-12 


45.9 


4 


700-786 


1086 


fn3


Fibronectin type III domain




97.2 


5 


802-888 


1086 


OsmC 


OsmC-like protein 


0 S7 


4.1 


1 


984-1018


1086


ig


Immunoglobulin domain


0.00041


18.2 


3 


1133-1191


1086


ig


Immunoglobulin domain


1.7e-07 


30.9 


4 


1349-1405


1086


Aegerolysin


Aegerolysin


0.56 


4.3 


1 


1411-1428


1087 


KRAB 


KRAB box 


5.8e-25 


92.4 


1 


14-54 


1087 


DUF19 


Domain of unknown function DUF19 


0.044 


5.2 


1 


80-105 


1087 


TFIIS 


Transcription factor S-II (TFIIS)


1 


4.7 


1 


161-171 


1087 


zf-C2H2 


Zinc finger, C2H2 type 


8.3e-07 


32.0 


1 


161-183 


1087 


XPA N 


XPA protein N-terminal 


0.49 


5.8 


2 


186-198 


1087 


TFIIS 


Transcription factor S-II (TFIIS) 


0.21 


7.0 


2 


189-199 


1087 


zf-C2H2 


Zinc finger, C2H2 type 


2.8e-07 


33.9 


2 


189-211 


1087 


XPA N 


XPA protein N-terminal 


0.47 


5.9 


3 


214-226 


1087


zf-C2H2


Zinc finger, C2H2 type


5.4e-07 


32.7 


3 


217-239 


1087 


zf-BED


BED zinc finger


0.11 


7.3 


2 


222-240 


1087


XPA_N


XPA protein N-terminal


0.52 


5.7 


4 


242-254 


1087 


TFIIS


Transcription factor S-II (TFIIS) 


0.32 


6.4 


4 


245-255 


1087 


zf-C2H2 


Zinc finger, C2H2 type 


3.2e-08 


37.5 


4 


245-267 


1087 


zf-BED 


BED zinc finger 


0.16 


6.8 


3 


246-268 


1088 


KRAB 


KRAB box 


5.8e-25 


92.4 




14-54 


1088 


DUF19 


Domain of unknown function DUF19 


0.044 


5.2 




80-105 


1088 


TFIIS 


Transcription factor S-II (TFIIS) 


1 


4.7 




161-171 


1088 


zf-C2H2 


Zinc finger, C2H2 type 


4.1e-08 


37.1 




161-183 


1088 


zf-BED 


BED zinc finger 


0.24 


6.2 




162-184 


1089 


Keratin B2 


Keratin, high sulfur B2 protein


5.6e-21 


73.7 




2-117 


1089 


Keratin B2 


Keratin, high sulfur B2 protein 


0.011 


9.4 


2 


118-170 


TABLE 4B



SEQ ID


Model


Description


E_value


Score


Repeats


Position


1090


Keratin_B2


Keratin, high sulfur B2 protein


5.9e-11


38.5 




2-75 


1091


Keratin_B2


Keratin, high sulfur B2 protein


7.5e-06 


20.6 


! 


2-40 


1091


Keratin_B2


Keratin, high sulfur B2 protein


4e-18


63.7 




41-144 


1091


Keratin_B2


Keratin, high sulfur B2 protein


4e-05 


18.0 




145-205 


1092


abhydro_lipase


ab-hydrolase associated lipase region


1.9e-32 


117.8 




27-97 


1092


abhydrolase


alpha/beta hydrolase fold


9.5e-19 


67.6 


I 


111-388 




abhydro_lipase


ab-hydrolase associated lipase region


1.9e-32 


117.8 


I 


87-157 


1093


abhydrolase


alpha/beta hydrolase fold


9.5e-19


67.6 




171-448 


1094


7tm_3


7 transmembrane receptor (metabotropi


0.75 


3.2 




24-45 


1094


7tm_3


7 transmembrane receptor (metabotropi


0.00057 


14.4 




65-109 


1094


Condensation


Condensation domain


036 


4.2 




157-169 


1094


7tm_3


7 transmembrane receptor (metabotropi


2.1e-05 


19.4 




168-271 


1095


Tuberin


Tuberin


0.59 


0.5 




17-23 


1095 


DAGAT


Diacylglycerol acyltransferase


6.2e-98 


335.5 




38-216 


1096


GASA


Gibberellin regulated protein


0.35 


1.3 




22-51 


1096


lectin_c


Lectin C-type domain


8.9e-26 


95.8 


I 


100-208 


1097


GASA


Gibberellin regulated protein


0.35 


1.3 




22-51 


1097


lectin_c


Lectin C-type domain


2.5e-27 


101.0 


I 


100-208 


1098


7tm_1


7 transmembrane receptor (rhodopsin f


2.6e-50 


149.5 


I 


41-290 


1098


endotoxin_N


delta endotoxin, N-terminal domain


0.87 


3.6 


I 


195-225 


1099


SEA


SEA domain


1.2e-06 


24.2 


I 


330-408 


1099


uuri/ io 


Bacterial protein of unknown function


0.091 


7.2 


I 


550-576 


1099


Un-fr, fio 
rianta 


T-Tanto\/iriic c/1 vo AAr Atpin Cis 
ri all lav u Ub giyvupiuiviii vj*» 


0.027 


6.9 


I 


550-578 


1099


Peptidase_C13


Peptidase C13 family


0.32 


3.5 


I 


554-582 


1100


ig


Immunoglobulin domain


0.0046 


14.3 


I 


146-203 


1100


ig


Immunoglobulin domain


3.2e-07 


29.9 




245-295 


1100


FHIPEP


FHIPEP family


0.21 


3.5 


I 


315-326 


1101


LRRNT


Leucine rich repeat N-terminal domain


0.00068 


15.2 


I 


23-49


1101


LRR


Leucine Rich Repeat


8.7e-05 


18.9 


I 


51-74 


1101


LRR


Leucine Rich Repeat


0.00032 


17.0 


2 


75-98 


1101


LRR


Leucine Rich Repeat


0.025 


10.6 


3 


99-122 


1101


LRR


Leucine Rich Repeat


0.00069 


15.8 


4 


123-146 


1101 


LRR 


Leucine Rich Repeat 


9.9e-06 


22.1 


5 


147-170 


1101


LRRCT


Leucine rich repeat C-terminal domain


2.3e-15 


48.2 




180-232 


1101 


ig 


Immunoglobulin domain 


1.3e-08 


35.1


1 


248-307 


1101


ig


Immunoglobulin domain


3.8e-09 


37.1 




344-400 


1101 


ig 


Immunoglobulin domain 


3.4e-05 


22.3 




440-490 


1101 


BON 


Transport-associated domain 


0.14 


7.1 




495-507 


1101 


ig 


Immunoglobulin domain 


3.1e-08 


33.7 




525-582 


1101 


pec lyase N 


Pectate lyase, N terminus 


0.19


3.9 




626-632 


1101 


An_peroxidase 


Animal haem peroxidase 


9.8e- 


657.1 




726-1265


1101 


PAL 


Phenylalanine and histidine ammonia-1 


0.53 


2.6 




993-1010


1101 


7tm_1


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 


1 


1057-1065


1101 


Peptidase_C1


Papain family cysteine protease 


0.76 


2.1 




1150-1167


1101 


TILa 


TILa domain 


0.00018 


16.9 




1394-1433


1101 


PSP94 


Beta-microseminoprotein (PSP-94) 


0.11 


8.0 




1395-1426


1101 


vwc 


von Willebrand factor type C domain 


2e-10 


38.0 




1395-1450


1102


LRRNT


Leucine rich repeat N-terminal domain


0.00068


15.2 


1 


23-49 


1102 


LRR


Leucine Rich Repeat 




18.9


\ 


51-74 


1102


LRR


Leucine Rich Repeat


0.021 


10.9 


2 


75-98 


1102


LRR


Leucine Rich Repeat


0.00069


15.8 


3 


99-122 


1102 


LRR 


Leucine Rich Repeat 


9.9e-06 


22.1 


4 


123-146 


1102 


LRRCT 


Leucine rich repeat C-terminal domain 


2.3e-15


4R 9 




156-208 


1102 


ig 


Immunoglobulin domain 






1 


224-283 


1102 


ig 


Immunoglobulin domain 


3.8e-09 


37.1 




320-376 


1102 


ig 


Immunoglobulin domain 




99 1 

ZZ.O 




416-466 


1102 


BON 


Transport-associated domain 


0.14 


7.1 


i 


471-483 


1102 




Immunoglobulin domain 


J. le-uo 


Jj. / 






1102 


pec lyase N 


Pectate lyase, N terminus


0.19 


3.9 


i 


602-608 


1102 


An_peroxidase 


Animal haem peroxidase 


ft On 

195 


OS /.I 




709- 

1241 


1102 


PAL 


Phenylalanine and histidine ammonia-1 


ft C3 


z.o 




06Q-0R6 

707-700 


1102 


7tm_1


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 


i 

— 


1033-1041


1102 


Peptidase_C1


Papain family cysteine protease 


U. /O 


O 1 
Z. 1 





1 196- 
1 143 


1102 


TILa 


TILa domain 


u.uuuio 


16 Q 
io.^ 




1370-1409


1102 


PSP94 


Beta-microseminoprotein (PSP-94) 


0.11 


8.0 




1371-1402


1102 


vwc 


von Willebrand factor type C domain 


2e-10 


38.0 


i 


1371-1426


1103 


PMEI 


Plant invertase/pectin methylesterase




5 5 




2-24 


1103 


Ribosomal S26e 


Ribosomal protein S26e 


o 47 


T> 9 


-T 


215-236 


1103 


ATP-gua_Ptrans 


ATP:guanido phosphotransferase, C-ter 


0 ORQ 

U.UQ7 


6 1 




240-262 


1103 


Arch_fla_DE 


Archaeal flagella protein 


n 49 


S 0 




670-683 


1103 


zf_dskA_traR 


Prokaryotic dksA/traR C4-type zinc fi


o 4R 


4 8 




1145-1160


1103 


zf-C4 


Zinc finger, C4 type (two domains)


0 07 


7.6 




1147-1157


1104 


UBX 


UBX domain 


0.79 


4.8 


j 


1-18 


1104 


FTCD C 


Formiminotransferase-cyclodeaminase


0.21 


6.0 




47-77 


1104 


IKI3


IKI3 family


0.66 


0.9 




80-94 


1104 


Ribosomal L21p 


Ribosomal prokaryotic L21 protein


0 27 


5.0 




115-138 


1104 


DUF709 


Family of unknown function (DUF709) 


0.28 


6.5 


1 


215-227 


1105 


Torsin 


Torsin 


c ?p-07 
o.-£c-i/ / 


23.3 




106-125 


1106 


rrm 


RNA recognition motif. (a.k.a. RRM, R




8 5 




41-71 


1107 


MH1 


MH1 domain


0.16


5.4 




288-309 


1107 


Pecanex_C 


Pecanex protein (C-terminus) 


0 7f»- 
Z.Ze- 

190 
izy 


440 0 




437-621 


1107 


Pecanex C 


Pecanex protein (C-terminus)


4p-0R 

HC"wO 


29.6 




622-640 


1108 


MH1 


MH1 domain 


ft 16 

IS. 1U 


5.4 




288-309 


1108


Pecanex_C


Pecanex protein (C-terminus)


1.2e-106


364.5 


! 


437-599 


1110 


Herpes_UL14 


Herpesvirus UL14-like protein 


0.12 


6.1 




17-43 


1110 




Immunoglobulin domain 


0.57 


6.5 




36-55 


1110 


Ribosomal L4 


Ribosomal protein L4/L1 family 


0.95 


3.3 




79-107 


1110 


Vpu 


Vpu protein 


0.34 


4.7 




106-140 


1110 


BPD transp 


Binding-protein-dependent transport s 


0.9 


3.7 




109-137 


1111 


DUF895 


Eukaryotic protein of unknown functio 


0.68 


4.1 




133-149 


1112 


RWD 


RWD domain 


9.8e-40 


142.2 




11-125 


1112


globin


Globin


0.048 


8.6 




88-120 


1112 


eRFl 2 


eRFl domain 2 


0.72 


4.1 




114-127 


1112 


zf-C3HC4


Zinc finger, C3HC4 type (RING finger)




77 Q 




n 5-201 

1JJ Z>t/ 1 


1112


U 


NAD-dependent DNA ligase C4 zinc fing


0 ^7 


S 8 
J .o 


"j 


1 96-207 


1112


zf-MIZ


MIZ zinc finger




4 6 


"j 


197-207 


1112


ApoA-II


Apolipoprotein A-II (ApoA-II)


0.94


3.6 




261-272 


1113 


pkinase 


Protein kinase domain 


1e-45


162.0 




194-468 


1113 


Pox_M2


Poxvirus M2 protein 


n 87 






JVJU-J J J 


1113 


DUF857 


Domain of unknown function (DUF857)


0.48 


4.7 


1 


417-428 


1113 


ExoD 


Exopolysaccharide synthesis, ExoD 




7 7 
z. / 


-j 




1113 


T/"C70 

KJbz ] 


isjiz ramiiy protein 




7 7 


-j 




1114


SNF2_N


SNF2 family N-terminal domain


fl 0^7 


6 A 





R7-1 1 S 


1114 


Transposase_8 


Transposase 


n 

u.JO 


S 6 

J.O 


-j 


87-105 


1114


OKR_DC_1_N


Orn/Lys/Arg decarboxylase, N-terminal




1 6 

J.U 


-T 


91-102 


LUO 


DUF1006


Protein of unknown function (DUF1006)


0 45 


2.5 




5-24 


1117


ig


Immunoglobulin domain


2.3e-05 


22.9 




30-87 


1117


ig


Immunoglobulin domain


0.0023 


15.5 


2 


127-186 


1117


ig


Immunoglobulin domain


0.00079


17.2 


3 


281-337 


1117 


ig 


Immunoglobulin domain 


0.026 


11.5 


4 


379-434 


1117


SNF7


SNF7




3 5 
j . j 




435-450 


1118 


ig 


Immunoglobulin domain 


0.00079 


17.2 




42-98 


1118


ig 


Immunoglobulin domain 


n 076 


1 1 s 

1 1 . J 




140-195 


1118


SNF7


SNF7


n os 


T s 

J.J 




196-211 


1119


IBN_NT


Importin-beta N-terminal domain


1 6p-74 


O^. J 


-7 


28-100 


1119 


PurA 


PurA ssDNA and RNA-binding protein 


0.19 


4.8 


I 


155-171 


1119 


PAN 


PAN domain


1


^ 7 
j.Z 




7ft6-7^S 


1120 


Bowman-Birk_leg


Bowman-Birk serine protease inhibitor 


1


4 ft 


-j 


28-36 


1120 


RNA_pol_Rpb2_1


RNA polymerase beta subunit 


0.25 


2.1 




150-946 


1120


cobW


Cobalamin synthesis protein/P47K


u.OJ 


2.3 




170-205 


1120


uuryuy 


Bacterial protein of unknown function


0 16 

v. IU 


7 1 




215-247 


1120


Glyco_hydro_2_


Glycosyl hydrolases family 2, TIM bar


0.24 


4.8 




262-277 


1120


ank


Ankyrin repeat


1.2e-10


41.3 


1^ 


920-952 


1120


ank


Ankyrin repeat


2.5e-08 


33.0 




953-985 


1120


SH3


SH3 domain


5.7e-16 


61.3 




1022-1079


1122


TPR


TPR Domain


0.013 


12.3 


i 


138-157 


1122


TPR


TPR Domain


1.1e-07


30.1 




158-191 


1122


TPR


TPR Domain


0.29 


7.5 




192-222 


1122 


BEX 


Brain expressed X-linked like family 


0.25 


3.9 


! 


261-294 


1122 


eRFl 2 


eRFl domain 2 


0.12 


6.9 




322-338 


1122 


Subtilisin N 


Subtilisin N-terminal Region 


0.83 


5.1 




323-344 


1123 


Pencillinase R 


Penicillinase repressor 


0.85 


4.2 




57-75 


1124 


ank 


Ankyrin repeat 


1.8e-07 


29.9 




64-96 


1124 


ank 


Ankyrin repeat 


1.5e-06 


26.5 


2 


97-129 


1124 


ank 


Ankyrin repeat 


2e-07 


29.7 


3 


130-162 


1124 


Shigella OspC 


Shigella flexneri OspC protein 


0.51 


3.2 


1 


131-161 


1124 


ank 


Ankyrin repeat 


4.3e-06 


24.9 


4 


163-195 


1124


ank


Ankyrin repeat


O.UUOlo 


19.2 


5 


1 A/C HQ 


1125


ank


Ankyrin repeat


1.8e-07


29.9 


1


64-96


1125


ank 


Ankyrin repeat 


1.5e-06 


26.5


2 


97-129


1125


ank 


Ankyrin repeat 


2e-07 


29.7 


3 


130-162


1125 


Shigella_OspC


Shigella flexneri OspC protein 


0.51


3.2 


1 


131-161


1125 


ank 


Ankyrin repeat 


4.3e-06 


24.9 


4 


163-195 


1126


DUF846 


Eukaryotic protein of unknown functio 


0.5 


2.6 


l 


50-74 


1129 


Apolipoprotein 


Apolipoprotein A1/A4/E family 


0.95 


3.4 


1 


4-28 


1129 


F5 F8_type_C 


F5/8 type C domain 


1e-63


195.2 


1 


34-174 


1129 


laminin_G


Laminin G domain 


2.7e-10 


36.3 


1 


212-344 


1130 


Apolipoprotein 


Apolipoprotein A1/A4/E family 


0.95 


3.4 


1 


4-28 


1130 


F5 F8 type_C 


F5/8 type C domain 


1e-63


195.2 


1 


34-174 


1130 


laminin G 


Laminin G domain 


6.5e-11


38.5 


1 


212-344 


1130 


laminin G 


Laminin G domain 


1.8e-11


40.4 


2 


398-525 


1130 


EGF 


EGF-like domain 


1.1e-06


26.8 


I 


551-583 


1130 


fibrinogen_C 


Fibrinogen beta and gamma chains, C-t 


0.051 


6.6 


1 


601-634 


1130 


laminin_G


Laminin G domain 


2.7e-17 


60.5 


3 


821-943 


1130 


EGF 


EGF-like domain 


0.0014 


15.7 


2 


962-996 


1130 


laminin_G 


Laminin G domain 


0.00033 


15.3 


4 


1046-1179


1130 


DNA_PPF


DNA polymerase processivity factor 


0.69 


4.3 


1 


1059-1078


1130 


BenE 


Benzoate membrane transport protein 


0.29 


3.8 


1 


1239-1255


1130 


BPD_transp


Binding-protein-dependent transport s 


0.41 


5.0 


1 


1245-1276


1131 


HTH_9 


N-terminal HTH domain of molybdenum-b


0.72 


4.3 


1 


61-84 


1131


Glycos_transf_2 


Glycosyl transferase 


2.2e-31


105.9


1 


155-341


1131 


Ribosomal_S3_C 


Ribosomal protein S3, C-terminal doma


0.98 


3.4


1 


357-363


1131


Ricin_B_lectin


QXW lectin repeat


a 1 1 


Q A 
5.4 




— — 


40 /-4y0 


1131


Ricin_B_lectin


QXW lectin repeat


a a Ami 
U.UUU /3 


lo.5 







1132


bnterotoxin Ho 


Heat-stable enterotoxin 


A 71 
U./l 


1.4 


~ 


17 /II 

3/-43 


1132


VSP


Giardia variant-specific surface prot 


0.23


4.3


1 


yo-i/o i 


1132


tsp_1


Thrombospondin type 1 domain 


U.Z/ 


a a 


^- 


I/IO OA1 


1133


pkinase


Protein kinase domain 


1 lo /1Q 


171 £ 

1 / l.o 




1 1 117 
1 1-Z5 / 


1 133 


Pox ser-thr kin 


Poxvirus serine/threonine protein kin 


0.2


4.5


~T 


1 11 1 SA. 
133-1 JO 


1 133 


pkinase


Protein kinase domain 


A AAA1 *7 


10.2 


-~ 

~ 


111 1/17 


1135


C2


C2 domain 


O /la 1A 

2.4e-3U 


1A1 Q 

iu3.y 


— 


7 CQ 


1135


Transposase_24


Plant transposase (Ptta/En/Spm family


A 11 






/ll z& 
42-50 


1135


photoRC 


Photosynthetic reaction centre protei 


A AC 


1 0 
1.0 




/!< X7 

45-0/ 


1135


C2


C2 domain 


oe-32 


i no a 


— 


135-210


1135


RasGAP 


GTPase-activator protein for Ras-like 


5.4e-39


124.6 




323-513


1135


AraC_binding


AraC-like ligand binding domain 


0.42


5.2 




414-452




PH


PH domain


0 P 1 1 

o.oe-i i 







*\f\l £71 


1135 


BTK 


BTK motif 


1.9e-06 


26.9 




675-711 


1137 


toxin 2 


Scorpion short toxin 


0.089 


6.2 




51-77 


1137 


C tripleX 


Cysteine rich repeat 


9e-05 


15.9 




54-71 


1137 


EGF 


EGF-like domain 


0.00049 


17.3 




60-86 


1137 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.55 


5.3 




75-88 


1137 


EGF 


EGF-like domain 


0.00015 


19.2 


2 


123-155 


1137 


TIL 


Trypsin Inhibitor like cysteine rich 


0.55 


4.1 


2 


142-163 


1137 


EGF 


EGF-like domain 


0.00018 


18.9 


3 


163-197 
1137


TIL 


Trypsin Inhibitor like cysteine rich 


0.0065 


10.1 


3 


180-203 


1137 


EGF 


EGF-like domain 


0.031 


10.8 


4 


215-232 


1137 


EGF 


EGF-like domain 


3.7e-07 


28.6 


5 


248-283 


1137 


EB 


EB module 


0.73 


4.1 


1 


254-283 


1137 


PRK


Phosphoribulokinase / Uridine kinase 


0.74 


2.8 


1 


407-426 


1137 


MAM 


MAM domain 


3.1e-27 


100.7 


1 


452-593 


1137 


Omptin 


Omptin family 


0.99 


2.0 


1 


460-476 


1138 


SurE 


Survival protein SurE 


0.68 


2.6 




10-23 


1138 


Pox A11


Poxvirus A11 Protein


0.17 


3.2 


1


57-75 


1138 


zf-C2H2 


Zinc finger, C2H2 type 


0.55 


8.5 


1 


205-228 


1138 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.64 


2.6 


1 


205-210 


1139 


4HBT 


Thioesterase superfamily 


0.038 


8.2 


1 


52-120 


1143 


7tm 1 


7 transmembrane receptor (rhodopsin f 


1.4e-28 


84.3


1 


1-173 


1144 


7tm 1 


7 transmembrane receptor (rhodopsin f 


4.5e-49 


145.7 


1 


40-287 


1146 


SNARE 


SNARE domain 


0.28 


7.0 


1 


195-229 


1147 


IL1 


Interleukin-1 / 18


2.6e-23 


83.4 


1 


51-152 


1148 


filament 


Intermediate filament protein 


3.2e- 
106 


363.0 


1 


1-296 


1148 


K-box 


K-box region 


0.78 


4.1 


1 


14-31 


1148 


Ribosomal S4 


Ribosomal protein S4/S9 N-terminal do 


0.85 


4.3 


1 


76-97 


1148 


IATP 


Mitochondrial ATPase inhibitor, IATP 


0.46 


6.1 


1


125-148 


1148 


ERG2 Sigma1R


ERG2 and Sigma1 receptor like protein


0.36 


3.7 


1 


162-191 


1148 


filament 


Intermediate filament protein 


4.8e-31 


113.3 


2 


380-469 


1148 


K-box 


K-box region 


0.11 


7.0 


2 


397-415 


1148 


bZIP 


1/2 51 88.. 28 65 


0.3 


6.3 


2 


435-472 


1148 


Tfb2 


Transcription factor Tfb2 


0.17 


0.5 


1 


466-472 


1149 


Phage X 


Phage X family 


0.71 


4.2 




16-41 


1149 


2OG-FeII_Oxy


2OG-Fe(II) oxygenase superfamily


0.27 


6.0 




229-307 


1150 


MBOAT 


MBOAT family 


2.3e-08 


30.9 


1 


90-249 


1151 


filament 


Intermediate filament protein 


1.2e-38 


138.5 


1 


131-242 


1151 


filament 


Intermediate filament protein 


2.9e-83 


286.8 




244-412 


1151 


HSP70 


Hsp70 protein 


0.99 


2.0 


1 


268-294 


1151 


HAMP 


HAMP domain 


1 


4.8 


1


301-334 


1151 


DUF164 


Uncharacterized ACR, COG1579 


0.057 


7.3 




310-352 


1151 


bZIP 


bZIP transcription factor 


0.062 


8.7 




316-348 


1151 


Transposase 8 


Transposase 


0.79 


4.5 




317-335 


1151 


MutS V 


MutS domain V 


0.27 


4.5 


1 


354-370 


1151 


OEP 


Outer membrane efflux protein 


0.053 


7.0 


1 


356-393 


1151 


MutS IV 


MutS family domain IV


0.9 


4.6 




359-392 


1151 


Hpt 


Hpt domain 


0.49 


5.2 




365-389 


1151 


Retro M 


Retroviral M domain 


0.5 


4.2 




369-377 


1152 


Peptidase M10 
N 


Matrix metalloprotease, N-terminal do 


1e-42


119.7 


1 


12-95 


1152 


PG_binding_1


Putative peptidoglycan binding domain 


0.022 


10.3 


1 


60-90 


1152 


Peptidase_M10 


Matrixin 


8.7e-51 


178.9 


1 


102-206 


1152 


hemopexin 


Hemopexin 


1.6e-08 


30.8 




231-273


1152 


hemopexin 


Hemopexin 


4.4e-11


39.4 


2 


275-317 


1152 


hemopexin 


Hemopexin 


2.2e-13 


47.1 


3 


322-367 


1152 


hemopexin 


Hemopexin 


2e-05 


20.4 


4 


371-411 


1153 


Peptidase M10 
N 


Matrix metalloprotease, N-terminal do 


1e-42


119.7 


1 


12-95 


1153 


PG_binding_1


Putative peptidoglycan binding domain 


0.022 


10.3 


1 


60-90 


1153 


Peptidase_M10 


Matrixin 


8.7e-51 


178.9 


1 


102-206 


1153 


hemopexin 


Hemopexin 


1.6e-08 


30.8 


1 


231-273 
1153


hemopexin 


Hemopexin 


4.4e-11


39.4 


2 


275-317 


1153 


hemopexin 


Hemopexin 


2.2e-13 


47.1 


3 


322-367 


1153 


hemopexin 


Hemopexin 


2e-05 


20.4 


4 


371-411 


1154 


SUFU 


Suppressor of fused protein (SUFU) 


0 


1218. 
3 


1 


3-484 


1155 


LBP BPI CETP 


LBP / BPI / CETP family, N-terminal d 


1.5e-61


210.5 


1


38-217 


1155 


LBP BPI_CETP 
C 


LBP / BPI / CETP family, C-terminal d 


4.6e-32 


115.6 


1 


242-478 


1156 


HMG box 


HMG (high mobility group) box 


5.8e-32 


115.2 


1


85-153 


1156 


HEV ORF2 


Hepatitis E virus ORF-2 (Putative cap 


0.026 


8.5 




336-356 


1159 


zf-PARP 


Poly(ADP-ribose) polymerase and 
DNA-L 


3.5e-52 


183.5 


1


93-185 


1159 


DNA ligase_A_ 
N 


DNA ligase N terminus 


5.9e-13 


42.4 


1


319-433 


1159 


DNA ligase 


ATP dependent DNA ligase domain 


8.5e-74 


255.3 




480-636 


1159 


mRNA_cap_enzy 
me 


mRNA capping enzyme, catalytic 
domain 


0.00064 


7.5 


1 


594-613 


1160 


serpin 


Serpin (serine protease inhibitor) 


9.5e- 
151 


511.0 


1


1-425 


1166 


Peptidase C14 


Caspase domain 


2.4e-06 


23.6 




7-40 


1167 


ig 


Immunoglobulin domain 


2e-05 


23.2 




42-96 


1167 




Immunoglobulin domain 


0.0012 


16.6 




135-197 


1167 


ig 


Immunoglobulin domain 


0.0013 


16.4 




237-297 


1168 


UK 


Virulence determinant 


0.083 


7.0 




14-38 


1168 


TIP120 


TBP (TATA-binding protein) -interacti 


0 


2347. 
3 




25-908 


1168 


HEAT 


HEAT repeat 


0.093 


8.3 


2 


248-286 


1168 


HEAT 


HEAT repeat 


0.022 


10.4 


3 


343-364 


1168 


Armadillo seg 


Armadillo/beta-catenin-like repeat 


0.2 


8.0 


2 


682-721 


1169 


lectin c 


Lectin C-type domain 


7.4e-19 


72.8 


1 


131-231 


1171 


WD40 


WD domain, G-beta repeat 


8.5e-11


40.4 


1 


223-260 


1171 


WD40 


WD domain, G-beta repeat 


2.8e-06 


24.7 


2 


280-316 


1171 


WD40 


WD domain, G-beta repeat 


9e-09 


33.4 


3 


320-357 


1171 


WD40 


WD domain, G-beta repeat 


0.0041 


13.7 


4 


362-398 


1171 


WD40 


WD domain, G-beta repeat 


3.1e-14 


52.4 


5 


403-440 


1171 


WD40 


WD domain, G-beta repeat 


4.3e-12 


45.0 


6 


445-491 


1171 


WD40 


WD domain, G-beta repeat 


1.1e-14


54.0 


7 


496-533 


1171 


WD40 


WD domain, G-beta repeat 


0.23 


7.6 


8 


538-574 


1172 


ig


Immunoglobulin domain 


2.3e-05 


22.9 


1 


42-99 


1172 


ig 


Immunoglobulin domain 


0.0023 


15.5 


2 


139-198 


1172 


MBOAT 


MBOAT family 


3.1e-07 


26.8 




741-769 


1173 


ig 


Immunoglobulin domain 


2.3e-05 


22.9 




42-99 


1173 


ig 


Immunoglobulin domain 


0.0023 


15.5 




139-198 


1173 


MBOAT 


MBOAT family 


3.1e-07 


26.8 




741-769 


1174 


MBOAT 


MBOAT family 


2.3e-08 


30.9 


1


90-249 


1174 


MBOAT 


MBOAT family 


4.7e-09 


33.5 




3Uo-3 j 1 


1175 


PTE 


Phosphotriesterase family 


1.2e- 
139 


474.0 




7-233 


1187 


Pox int trans 


Poxvirus intermediate transcription fac 


0.092 


5.7 




94-122 


1183 


PS Dcarbxylase 


Phosphatidylserine decarboxylase 


0.011 


11.4 




165-181 


1183 


PS Dcarbxylase 


Phosphatidylserine decarboxylase 


8.3e-52 


182.3 




246-467 


1184 


TSC22 


TSC-22/dip/bun family 


5e-47 


146.4 




124-183 


1186 


ADP_PFK_GK 


ADP-specific

Phosphofructokinase/Glucokin 


7e-227 


763.9 




68-492 
1186


Mannitol dh 


Mannitol dehydrogenase 


0.052 


6.8 


1 


394-413 


1187 


DENN 


DENN (AEX-3) domain 


0.054 


6.0 


1 


16-40 


1188 


DPPIV N term 


Dipeptidyl peptidase IV (DPP IV) N-
termi 


0.5 


1.1 


1 


310-346 


1188 


DPPIV N term 


Dipeptidyl peptidase IV (DPP IV) N-
termi 


2.4e-07 


19.7 


2 


513-608


1188 


DPPIV N term 


Dipeptidyl peptidase IV (DPP IV) N-
termi 


5.3e-08 


21.7 


3 


646-680 


1188 


Peptidase S9 


Prolyl oligopeptidase family


3.9e-11


36.8 


1 


692-764 


1188 


Esterase 


Putative esterase 


0.062 


6.6 


1 


738-781


1189


DPPIV N term


Dipeptidyl peptidase IV (DPP IV) N-
termi


0.5 


1.1 


1 


310-346 


1189


DPPIV N term


Dipeptidyl peptidase IV (DPP IV) N-
termi


2.4e-07 


19.7 


2 


513-608 


1189 


DPPIV N term 


Dipeptidyl peptidase IV (DPP IV) N-
termi 


5.3e-08


21.7 


3 


646-680 


1189 


Peptidase S9 


Prolyl oligopeptidase family 


3.9e-11


36.8


1


692-764 


1189 


Esterase 


Putative esterase 


0.062 


6.6 


1 


738-781 


1190 


DPPIV_N_term 


Dipeptidyl peptidase IV (DPP IV) N- 
termi 


0.5 


1.1 


1 


310-346 


1190 


DPPIV_N_term 


Dipeptidyl peptidase IV (DPP IV) N- 
termi 


5.3e-08 


21.7 


2 


633-667 


1190 


Peptidase_S9 


Prolyl oligopeptidase family 


3.9e-11


36.8 


1 


679-751 


1190 


Esterase 


Putative esterase 


0.062 


6.6 


1 


725-768 


1191 


Ribosomal S25 


S25 ribosomal protein 


6.7e-67 


232.4 


1 


2-100 


1191 


DUF387 


Putative transcriptional regulators (Yp 


0.099 


5.8 


1 


65-87 


1193 


ank 


Ankyrin repeat 


6.7e-10 


38.6 


2 


49-81 


1193 


ank 


Ankyrin repeat 


2.7e-08 


32.8 


3 


82-114 


1193 


ank 


Ankyrin repeat 


0.0036 


14.4 


4 


115-147 


1193 


ank 


Ankyrin repeat 


1.5e-11


44.5 


5 


148-180 


1193 


ank 


Ankyrin repeat 


1.2e-08 


34.1 


6 


181-213 


1193 


ank 


Ankyrin repeat 


3.3e-08 


32.5 


7 


214-246 


1193 


ank 


Ankyrin repeat 


3.4e-11


43.2 


8 


247-279 


1193 


ank 


Ankyrin repeat 


1.3e-08 


33.9 


9 


280-313 


1193 


ank 


Ankyrin repeat 


0.0027 


14.9 


10 


314-346 


1193 


ank 


Ankyrin repeat 


8.5e-08 


31.1 


11 


347-379 


1193 


ank 


Ankyrin repeat 


0.013 


12.4


12 


380-404 


1193 


ank 


Ankyrin repeat 


8.3e-08 


31.1 


13 


431-463 


1193 


ank 


Ankyrin repeat 


1.1e-09


37.8 


14 


464-496 


1193 


ank 


Ankyrin repeat 


6.9e-07 


27.8 


15 


497-557 


1193 


endonuclease 7 


Recombination endonuclease VII 


0.034 


9.6 


1 


513-537 


1193 


ank 


Ankyrin repeat 


0.0047 


14.0 


16 


558-581 


1193 


ank 


Ankyrin repeat 


1.2e-05 


23.3 


17 


596-625 


1193 


ank 


Ankyrin repeat 


3.8e-10 


39.4 


18 


626-658 


1193 


ank 


Ankyrin repeat 


0.00034 


18.1 


19 


660-692 


1193 


ank 


Ankyrin repeat 


1.5e-09 


37.3 


20 


696-728 


1193 


ank 


Ankyrin repeat 


1.5e-05 


23.0 


21 


729-761 


1193 


ank 


Ankyrin repeat 


9.7e-05 


20.1 


22 


762-784 


1193 


ank 


Ankyrin repeat 


2.3e-07 


29.5 


23 


798-821 


1193 


ank 


Ankyrin repeat 


0.0023 


15.1 


24 


830-853 


1193 


ank 


Ankyrin repeat 


3.6e-10 


39.5 


25 


865-897 


1193 


ank 


Ankyrin repeat 


4.3e-06 


24.9 


26 


898-931 


1193 


ank 


Ankyrin repeat 


0.00019 


19.1 


27 


932-964 


1193 


ank 


Ankyrin repeat 


1.6e-07 


30.0 


28 


968- 
1000


1194 


trypsin 


Trypsin 


l.JC'^J 


72.8




166-342 


1194 


PDZ 


PDZ domain (Also known as DHR or 


0.0018 


14.0 


1 


372-412 


1195 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


2.4e-18


53.6 




1-137 


1196 


vwc 


von Willebrand factor type C domain


5.4e-05


19.0 




66-105 


1196 


vwc 


von Willebrand factor type C domain


U.1U 


6.8




108-163 


1196 


vwc 


von Willebrand factor type C domain




27.4 




166-192 


1197 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


4.5e-36 


106.7 




46-295 


1198 


MethyltransfD12


D12 class N6 adenine-specific
met


Z.. 1C JU 


127.8 




30-152 


1199


lipocalin


Lipocalin / cytosolic fatty-acid binding
pr


1.8e-21 


80.6 




32-176 


1200




OB-fold nucleic acid binding domain


2.7e-15 


56.4 




44-118 


1200


tRNA-synt_2


tRNA synthetases class II (D, K and N


2.7e-91 


313.5 


1


135-473 


1202


FAD binding 2


FAD binding domain


5.9e-54 


182.8 


1


5-100 


1203 


RasGEFN 


Guanine nucleotide exchange factor for
Ras-l


0.00072 


15.7 


1 


39-87 


1203


RasGEF


RasGEF domain


0.39 


5.5 


1 


211-240 


1203


RasGEF


RasGEF domain


6.8e-18 


69.6 




280-360 


1204


KH


KH domain


3.8e-17 


61.6 


1


17-63 


1204


KH


KH domain


5.4e-19 


68.0 




101-150 


1204


KH


KH domain


5.8e-16


57.6 




265-313 


1206


transket_pyr


Transketolase, pyridine binding domai


3.3e-75 


258.0 


1


15-190 


1206


transketolase


Transketolase, C-terminal domain


2e-59 


194.9 




208-331 


1 OAO 


Calsequestrin


Calsequestrin


6.3e- 
294 


986.6 




5-390 


1207


thiored


Thioredoxin


0.057 


9.0 


1


123-152 


1209 


PH 


PH domain 


0.0057 


11.0 


1 


70-97 


1210


ig


Immunoglobulin domain


4.9e-07 


29.2 




35-112 


1210


ig


Immunoglobulin domain


2e-06 


26.9 




154-228 


1213


cadherin


Cadherin domain


0.00026 


16.8 


1


33-96 


1213




Cadherin domain 


1.3e-06 


24.8 




143-235 


1213 


cadherin 


Cadherin domain 


1.3e-22 


80.6 


3 


249-343 


1213


cadherin


Cadherin domain


2.9e-14 


51.4 




361-448 


1213


cadherin


Cadherin domain


1.7e-22


80.2 




462-558 


1213


cadherin


Cadherin domain


2.4e-10 


37.8 




597-667 


1214


calreticulin


Calreticulin family


3.6e- 
221 


715.1 


1


21-315 


1218 


Alpha L fucos 


Alpha-L-fucosidase 


0.018 


8.4 


1 


10-34 


1221


Osteopontin 


Osteopontin 


1.4e-20 


64.7 




1-30 


1221 


Osteopontin 


Osteopontin 


2e-166 


531.5 




31-275 


1222 


serpin 


Serpin (serine protease inhibitor) 


156 


529.6




78-443 


1223 


ig


Immunoglobulin domain 


0.00035 


18.5 




31-101 


1223 


ig 


2/3 143 210.. 7 52 


1.3e-08 


35.1 




252-303 


1225 


HATPase c 


Histidine kinase-, DNA gyrase B-, and 


3.8e-15 


54.5 




16-164 


1225 


DNA_gyraseB 


DNA gyrase B 


4.1e-57 


199.9 




210-370 


1225 


DNA_topoisoIV


DNA gyrase/topoisomerase IV, subunit 


1.3e-
189 


610.1 




653- 
1120 


1225 


DUF188


Uncharacterized BCR, YaiI/YqxD
family 


0.025 


8.2 


1


1095- 
1121
1226


AMP-binding


AMP-binding enzyme


7.3e-84 


288.8 


1 


105-539 


1227


PPT 


PPT rlrvmain 


8.7e-10 


36.4 


1 


46-101 


1228


C1q


C1q domain


6e-45 


159.5 


1 


73-202 


1229


BTB


BTB/POZ domain


9.8e-14 


51.0 


1 


9-69 


1229


BTB


BTB/POZ domain


0.68 


4.5 


2 


70-101 


1230


ank


Ankyrin repeat


0.93 


5.8 


1


7-39 


1230


ank


Ankyrin repeat


0.00039 


17.9 


2 


40-85 


1230


ank


Ankyrin repeat


2.1e-05 


22.4 


3 


86-147 


1230


ank


Ankyrin repeat


0.057 


10.1 


4 


148-180 


1230


ank


Ankyrin repeat


3.6e-10 


39.5 


5 


181-213 


1230


ank


Ankyrin repeat


7.2e-08 


31.3 


6 


214-246 


1230


ank 




3.7e-06 


25.1 


7 


247-279 


1230


ank


Ankyrin repeat


1.7e-08 


33.6 


8 


280-312 




ank 


Ankyrin repeat


4.9e-07 


28.3 


9 


313-346 


1230


ank


Ankyrin repeat


0.00014 


19.6 


10 


347-379 


1230


ank


Ankyrin repeat


1.8e-07


29.9 


11 


380-412


1230


ank


Ankyrin repeat


0.038 


10.8 


12 


413-437 


1230


ank


Ankyrin repeat


2.5e-08 


32.9 


13 


464-496 


1230


ank


Ankyrin repeat


7.6e-08 


31.2 


14 


497-529 


1230


ank


Ankyrin repeat


2.2e-07 


29.5 


15 


530-590 


1230


ank


Ankyrin repeat


0.0048 


14.0 


16 


591-613 


1230


ank


Ankyrin repeat


0.0097 


12.9 


17 


629-658 


1230


ank


Ankyrin repeat


3.3e-06 


25.3 


18 


659-691 


1230 


ank 


Ankyrin repeat 


2.3e-05 


22.3 


19 


693-727 


1230


ank


Ankyrin repeat


3.1e-09 


36.2 


20 


729-761 


1230 


ank 


Ankyrin repeat 


0.00054 


17.4 


21 


762-794 


1230


ank


Ankyrin repeat


5.7e-06 


24.5 


22 


795-827 


1230 


ank 


Ankyrin repeat 


9.6e-05 


20.1 


23 


832-855 


1230


ank


Ankyrin repeat 


0.0013 


16.0 


24 


864-892 


1230 


ank 


Ankyrin repeat 


1.7e-08 


33.5 


25 


899-931 


1230 


ank 


Ankyrin repeat 


3.4e-06


25.3 


26 


932-965 


1230 


ank 


Ankyrin repeat 


0.001 


16.4 


27 


966-998 


1230 


ank 


Ankyrin repeat 


Z. JC**U / 


29.3


28 


1006- 
1034 


1231 


LBP_BPI_CETP


LBP / BPI / CETP family, N-terminal
do


1.5e-61


210.5




38-217 


1231 


LBP BPI CETP 
C


LBP / BPI / CETP family, C-terminal
do


7 ^p-97 


Q6 9 




242-472 


1232 


LBP_BPI_CETP 


LBP / BPI / CETP family, N-terminal
do


1.5e-61






38-217 


1232 


LBP_BPI_CETP
C


LBP / BPI / CETP family, C-terminal
do




92.4




242-472 


1233 


LBP_BPI_CETP


LBP / BPI / CETP family, N-terminal
do


1.5e-61


210.5 




38-217 


1233 


LBP BPI CETP 
C 


LBP / BPI / CETP family, C-terminal 
do 


2.6e-33 


120.1 


1 


242-478 


1234 


DUF408 


Domain of Unknown Function 
(DUF408) 


7.8e- 
114 


388.3 




41-222 


1237 


ig 


Immunoglobulin domain 


5.3e-05 


21.6 




28-86 


1237 




Immunoglobulin domain 


2e-08 


34.4 




127-184 


1237 


ig


Immunoglobulin domain 


6.2e-13 


51.3 




219-277 


1237 


fn3


Fibronectin type III domain 


6.3e-20 


71.0 




299-385 


1237 


fn3


Fibronectin type III domain 


8e-10 


35.9 




396-481 


1238 


Nuf2 


Nuf2 family 


3.3e-
104


356.3





1-148








1238 


HR1


Hrl repeat 


0.099 


7.1 


1 


187-214 


1240 


Sema 


Sema domain 


7.6e-
178 


601.0 


1 


59-477 


1240 


squash 


Squash family serine protease inhibitor 


0.033 


5.8 


1 


512-534 


1240 


PSI 


Plexin repeat 


0.00097 


13.3 


1 


514-543 


1240 


UreE_C 


UreE urease accessory protein, C- 
terminal do 


0.09 


7.9


1 


809-830 


1243 


rrm 


RNA recognition motif. (a.k.a. RRM, 
RBD, or 


0.0012 


15.1 


1 


29-93 


1247 


Peptidase M50 


Peptidase family M50


0.0024 


12.4 


1 


9-875 


1247 


C tripleX 


Cysteine rich repeat 


0.00046 


13.8 


1 


103-120 


1247 


EGF 


EGF-like domain 


0.089 


9.2 


1 


105-135 


1247 


EGF 


EGF-like domain


0.0047 


13.8 


2 


148-178 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.0006 


15.5 


1 


152-195 


1247 


EB 


EB module 


0.21 


5.7 


2 


188-217 


1247 


EGF 


EGF-like domain 


3.2e-06 


25.2 


3 


191-221 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.17 


7.0 


2 


199-238 


1247 


EGF 


EGF-like domain 


8.6e-06 


23.6 


4 


234-264 


1247 


DSL 


Delta serrate ligand 


0.024 


8.9 


4 


250-264 


1247 


EGF 


EGF-like domain 


5.5e-06 


24.3 


5 


277-307 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


1.2e-05 


21.3 


4 


281-317 


1247 


EGF 


EGF-like domain 


0.00028 


18.2 


6 


320-350 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.018 


10.3 


5 


324-351 


1247 


DSL 


Delta serrate ligand


0.15 


6.3 


6


353-364 


1247 


EGF 


EGF-like domain 


0.053 


10.0 


7 


364-396 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.00011 


18.0 


6 


368-406 


1247 


DSL 


Delta serrate ligand 


0.034 


8.4 


7 


383-396 


1247 


DSL 


Delta serrate ligand 


0.76 


4.0 


8 


397-409 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.68 


4.9 


7 


413-456 


1247 


EGF 


EGF-like domain 


4.8e-05 


20.9 


9 


415-439 


1247 


DSL 


Delta serrate ligand 


0.46 


4.7 


9 


425-439 


1247 


EGF 


EGF-like domain 


1.2e-05 


23.1 


10 


452-482 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.19 


6.9 


8 


460-499 


1247 


DSL 


Delta serrate ligand 


0.21 


5.8 


10 


469-482 


1247 


EB 


EB module 


0.89 


3.8 


4 


492-525 


1247 


EGF 


EGF-like domain 


2.3e-05 


22.1 


11 


495-525 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.011 


11.1 


9 


502-542 


1247 


DSL 


Delta serrate ligand 


0.023 


9.0 


11 


512-525 


1247 


EGF 


EGF-like domain 


0.03 


10.8 


12 


538-568 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V)


0.0055 


12.1 


10 


546-587 


1247 


DSL 


Delta serrate ligand 


0.0012 


13.1 


12 


553-568 


1247 


EGF 


EGF-like domain 


0.0001 


19.8 


13 


581-611 


1247 


EB 


EB module 


0.025 


8.5 


5 


587-611 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.065 


8.4 


11 


589-631 


1247 


DSL 


Delta serrate ligand 


0.5 


4.6 


14 


612-624 


1247 


EGF 


EGF-like domain 


0.48 


6.5 


14 


614-624 


1247 


EGF 


EGF-like domain 


0.041 


10.4 


15 


631-669 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V)


0.00028 


16.6 


12 


634-658 


1247 


EGF 


EGF-like domain 


0.00023 


18.5 


16 


675-699 


1247 


DSL 


15/20 647 656.. 58 67 


0.15 


6.3 


16 


689-699 


1247 


EGF 


EGF-like domain 


7.7e-05 


20.2 


17 


712-742 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.56 


5.2 


13 


716-752 


1247 


DSL 


15/20 647 656.. 58 67 


0.083 


7.1 


17 


729-742 
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1247 


EGF 


EGF-like domain 


0.57 


6.2 


18 


745-755 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


3.4e-05 


19.8 


14 


759-802 


1247 


EB 


EB module 


0.079 


7.0 


6 


760-785 


1247 


EGF 


EGF-like domain 


0.0048 


13.7 


19 


761-785 


1247 


DSL 


15/20 647 656.. 58 67 


0.16 


6.2 


19 


772-785 


1247 


EGF 


EGF-like domain 


0.0057 


13.5 


20 


798-828 


1247 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.0035 


12.8 


15 


805-830 


1247 


DSL 


15/20 647 656.. 58 67 


0.44 


4.8 


20 


809-828 


1249 


IBR 


IBR domain 


1.1e-05


19.4 


1


74-104 


1249 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.029 


6.4 


1 


114-134 


1249 


IBR 


IBR domain 


0.029 


8.4 


2 


132-164 


1250 


NC 


NC domain 


2.2e-47 


167.3 


1


172-253 


1251 


Aa trans 


Transmembrane amino acid transporter 
protein 


1.9e-77 


267.5


1 


52-365 


1252 


Aa_trans 


Transmembrane amino acid transporter 
protein 


2.5e-76 


263.7 


1 


45-354 


1252 


Aa_trans 


Transmembrane amino acid transporter 
protein 


2.7e-06 


22.9 


2 


355-419 


1254 


FGF 


Fibroblast growth factor 


2.2e-40 


137.9 


1 


42-166 


1255 


LRR 


Leucine Rich Repeat 


0.033 


10.2 


1 


49-70 


1255 


LRR 


Leucine Rich Repeat 


0.21 


7.5 


2 


71-92 


1255 


LRR 


Leucine Rich Repeat 


0.57 


6.0


3 


94-115 


1255 


LRR 


Leucine Rich Repeat 


0.46 


6.3 


4 


116-140 


1256 


RPE65 


Retinal pigment epithelial membrane 
protein 


8.9e-59 


199.4 


1 


60-416 


1256 


RPE65 


Retinal pigment epithelial membrane 
protein 


4.3e-27 


91.2 


2 


462-579 


1257 


RPE65 


Retinal pigment epithelial membrane 
protein 


8.9e-59 


199.4 


1 


42-398 


1257 


RPE65 


Retinal pigment epithelial membrane 
protein 


4.3e-27 


91.2 


2 


444-561 


1258 


ig 


Immunoglobulin domain 


0.001 


16.8 


1 


39-97 


1258 




Immunoglobulin domain 


1.2e-11


46.5 


2 


128-189 


1260 


DUF948 


Bacterial protein of unknown function 
(DU 


0.058 


8.0 


1 


249-269 


1261 


serpin 


Serpin (serine protease inhibitor) 


3.2e-08 


27.5 


1 


31-82 


1261 


serpin 


Serpin (serine protease inhibitor) 


1.2e-60 


206.3 


2 


212-423 


1262 


PMP22 Claudin 


PMP-22/EMP/MP20/Claudin family 


3.7e-16 


56.5 


1 


1-47 


1263 


arf 


ADP-ribosylation factor family 


5.1e-13 


43.2 


1 


10-132 


1264 


PAP2 


PAP2 superfamily 


4.5e-15 


54.4 


1 


106-241 


1265 


SRCR 


Scavenger receptor cysteine-rich doma 


2e-20 


73.6 


1 


37-128 


1265 


SRCR 


Scavenger receptor cysteine-rich doma 


6e-28 


100.2 


2 


136-227 


1265 


SRCR 


Scavenger receptor cysteine-rich doma 


6.6e-33 


117.8 


3 


232-329 


1265 


Arthro defensin 


Arthropod defensin 


0.0097 


7.3 


1 


340-364 


1265 


SRCR 


Scavenger receptor cysteine-rich doma 


3.1e-15


55.3 


4 


360-459 


1265 


SRCR 


Scavenger receptor cysteine-rich doma 


7.6e-33 


117.6 


5 


477-574 


1266 


SRCR 


Scavenger receptor cysteine-rich doma 


2e-20 


73.6 


1 


37-128 


1266 


SRCR 


Scavenger receptor cysteine-rich doma 


6e-28 


100.2 


2 


136-227 


1266 


SRCR 


Scavenger receptor cysteine-rich doma 


6.6e-33 


117.8 


3 


232-329 


1266 


Arthro defensin 


Arthropod defensin 


0.0097 


7.3 


1 


340-364 


1266 


SRCR 


Scavenger receptor cysteine-rich doma 


3.1e-15 


55.3 


4 


360-459 


1266 


SRCR 


Scavenger receptor cysteine-rich doma 


7.6e-33 


117.6 


5 


477-574 


1270 


Armadillo seg


Armadillo/beta-catenin-like repeat 


2.7e-05 


21.8 


1 


53-93 


1270 


Armadillo seg 


2/5 546 586.. 1 41 


0.11 


9.0 


5 


691-716 
1971
IZ I J 


Pneumo att G


Pneumovirinae attachment membrane
glycop


0.098 


4.7 


1 


57-70 


1971 


pkinase


Protein kinase domain


3e-77 


266.8 


1


103-387 


1275


Pep_M12B_prop
ep


Reprolysin family propeptide


3.2e-37 


116.5 




97-215 


1275


Reprolysin


Reprolysin (M12B) family zinc metallo


1.1e-88


304.8 




227-426 


1275


Peptidase M43


Pregnancy-associated plasma protein-A 


0.056 


5.5 


1


362-372 


1275


disintegrin


Disintegrin


1.7e-39 


134.2 




443-518 


1275


EGF


EGF-like domain


0.0023 


14.8 


1


670-697 


1276


FpoA 


FpoA familv 

l wun. L<xi tiny 


0.088 


8.4 




132-239 


1277


ank


Ankyrin repeat


2.3e-05 


22.3 




301-339 


1277


ank


Ankyrin repeat


9.5e-11


41.6 




340-373 


1277


Dehydratase LU


Dehydratase large subunit 


0.015


7.6 


1


369-403 


1278


Peptidase M1


Peptidase family M1


7.1e- 
137 


383.8 




98-506 


1284 


Aa trans


Transmembrane amino acid transporter 


2.4e-30


110.9 


1


4-397 




ARPF 


Aromatic-Rich Protein Family 


4.3e-09 


31.3 


1


74-190 


1288 


LRR 


Leucine Rich Repeat 


0.41 


6.5 




66-89 


1288 


LRR 


Leucine Rich Repeat 


0.0017 


14.6 


2 


90-113 


1288 


LRR 


Leucine Rich Repeat 


0.76 


5.6 


3 


114-137 


1288 


LRR 


Leucine Rich Repeat 


0.0013 


14.9 


4 


138-161 


1288 


LRR 


Leucine Rich Repeat 


0.0043 


13.1 


5 


163-186 


1288 


LRR 


Leucine Rich Repeat 


0.0088 


12.1 


6 


187-210 


1288 


LRR 


Leucine Rich Repeat 


0.063 


9.2 


7 


211-231 


1288 


LRRCT 


Leucine rich repeat C-terminal domain 


2.6e-10 


32.7 


1 


252-297 






Immunoglobulin domain 


5.8e-09 


36.4 


1 


314-372 


1989 


Hnntinofin 


Hunrinetin 


0.077 


5.4 


1 


768-790 


1290 


LRRNT 


Leucine rich repeat N-terminal domain 


0.0011 


14.5 


1 


32-59 


1290 


LRR 


Leucine Rich Repeat 


0.0059 


12.7 


1 


61-84 


1290 


LRR 


Leucine Rich Repeat


0.00021 


17.6 


2 


85-108 


1290 


LRR 


Leucine Rich Repeat 


0.012 


11.6 


3 


110-132 


1290 


LRRCT 


Leucine rich repeat C-terminal domain 


0.00014 


15.2 


1 


131-144 


1291 


PH 


PH domain 


0.053 


7.8 


1 


7-98 


1291 


DAGKc 


Diacylglycerol kinase catalytic domain 
(pres


0.00081 


14.7 


1 


90-177 


1292 


ig


Immunoglobulin domain 


0.069 


9.9 


1 


48-120 


1292 


ig


Immunoglobulin domain 


8.1e-09 


35.9 


2 


161-219 


129^ 


ig


Immunoglobulin domain 


0.069 


9.9 


1 


48-120 


1991 


ig


Immunoglobulin domain 


8.1e-09 


35.9 


2 


161-219 


1295


C1q


C1q domain


5.3e-49 


173.0 


1 


72-198 


1296 


7tm_1


7 transmembrane receptor (rhodopsin
famil


1.3e-08 


24.4 


1


49-108 


1296


7tm 1


7 transmembrane receptor (rhodopsin
famil


4.6e-31


91.7 


2 


109-332 


1297 


MED7 


MED7 protein 


0.0099 


9.5 


1 


202-242 


1297 


CH 


Calponin homology (CH) domain 


2.7e-31


114.2 


1 


215-316 


1297 


CH 


Calponin homology (CH) domain 


3.7e-26 


97.1 


2 


331-433 


1297 


UVR 


UvrB/uvrC motif 


0.0066 


12.8 


1 


652-664 


1297 


spectrin 


Spectrin repeat 


0.007 


11.5 


1 


793-852 


1297 


ACCA 


Acetyl co-enzyme A carboxylase 
carboxy 


0.017 


10.3 


1 


832-873 


1297 


spectrin 


Spectrin repeat 


4.9e-05 


18.9 


2 


922-973 


1297 


PolC DP2


DNA polymerase II large subunit DP2 


0.013 


2.0 


1


928-939 


1297 


DUF622 


Protein of unknown function, DUF622 


0.043 


9.8 


1 


1313- 



1341 


1297 


Myc-LZ 


Myc leucine zipper domain 


0.13 


7.7 


2 


1313- 
1338 


1297 


spectrin 


Spectrin repeat 


0.38 


5.5 


3 


1486- 
1512 


1297 


bZIP 


1/3 644 674.. 35 65 


0.058 


8.8 


3 


1698- 
1722 


1297 


Prefoldin 


Prefoldin subunit 


0.56 


5.2 


3 


1709- 
1736 


1297 


M 


M protein repeat 


0.44 


8.1 


2 


1939- 
1959 


1297


ldh C 


lactate/malate dehydrogenase, alpha/be 


0.35 


5.2 


2 


2093- 
2118 


1297 


FTCD C 


Formiminotransferase-cyclodeaminase 


0.029 


9.2 


1 


2108- 
2146 


1297 


Laminin II


Laminin Domain II 


0.032 


9.5 


1 


2152- 
2219 


1297 


Tropomyosin 


Tropomyosin 


0.019 


8.9 


1 


2210- 
2251 


1297 


Pox A type inc


2/7 1057 1069.. 1 13 


0.47 


6.6 


6 


2364- 
2379 


1297 


Tropomyosin 


Tropomyosin 


0.72 


3.2 


2 


2396- 
2425 


1297 


Pox A type inc 


2/7 1057 1069.. 1 13 


0.57 


6.3 


7 


2399- 
2421 


1297 


Plectin 


Plectin repeat 


1e-19


74.9 


2 


2734- 
2778 


1297 


Plectin 


Plectin repeat 


8.3e-16 


60.6 


3 


2808- 
2852 


1297 


CBM_14


Chitin binding Peritrophin-A domain


0.0038 


11.3 


1 


2867- 
2884 


1297 


Plectin 


Plectin repeat 


2e-05 


22.8 


4 


2907- 
2939 


1297 


Plectin 


Plectin repeat 


0.018 


12.0 


6 


3012- 
3042 


1297 


Plectin 


Plectin repeat 


2.1e-20 


77.4 


7 


3043- 
3087 


1297 


ECH 


Enoyl-CoA hydratase/isomerase family 


0.00096 


14.0 


1 


3059- 
3080 


1297 


Plectin 


Plectin repeat 


0.083 


9.6 


8 


3088- 
3118 


1297 


Plectin 


Plectin repeat 


1.3e-16


63.5 


9 


3119- 
3163 


1297 


Plectin 


Plectin repeat 


0.44 


6.9 


10 


3169- 
3201 


1298 


MED7 


MED7 protein 


0.0099 


9.5 


1 


202-242 


1298 


CH 


Calponin homology (CH) domain 




IU/.o 


i 
i 


215-328 


1298 


CH 


Calponin homology (CH) domain 


3.7e-26 


97.1 


2 


343-445 


1298 


UVR 


UvrB/uvrC motif 


0.0066 


12.8 


1 


664-676 


1298 


spectrin 


Spectrin repeat 


0.007 


11.5 


1 


805-864 


1298 


ACCA 


Acetyl co-enzyme A carboxylase 
carboxy 


0.017 


10.3 


1


844-885 


1298 


spectrin 


Spectrin repeat 


4.9e-05 


18.9 


2 


934-985 


1298 


PolC DP2 


DNA polymerase II large subunit DP2 


0.013 


2.0 


1 


940-951 


1298 


DUF622 


Protein of unknown function, DUF622 


0.043 


9.8 


1 


1325-
1353


1298 


Myc-LZ 


Myc leucine zipper domain 


0.13 


7.7 


2 


1325- 
1350 


1298 


spectrin


Spectrin repeat 


0.38 


5.5 


3 


1498- 
1524 


1298 


bZIP 


1/3 656 686.. 35 65 


0.058 


8.8 


3 


1710- 
1734 


1298 


Prefoldin 


Prefoldin subunit 


0.56 


5.2 


3 


1721- 
1748 


1298 


M 


M protein repeat 


0.44 


8.1 


2 


1951- 
1971 


1298 


ldh C


lactate/malate dehydrogenase, alpha/be 


0.35 


5.2 


2 


2105- 
2130 


1298 


FTCD C 


Formiminotransferase-cyclodeaminase 


0.029 


9.2 


1 


2120- 
2158 


1298 


Laminin II


Laminin Domain II 


0.032 


9.5 


1 


2164- 
2231 


1298 


Tropomyosin


Tropomyosin 


0.019 


8.9 


1 


2222- 
2263 


1298 


Pox A type inc 


2/7 1069 1081 .. 1 13 


0.47 


6.6 


6 


2376- 
2391 


1298 


Tropomyosin


Tropomyosin 


0.72 


3.2 


2 


2408- 
2437 


1298 


Pox A type inc 


2/7 1069 1081.. 1 13 


0.57 


6.3 


7 


2411- 
2433 


1298 


Plectin 


Plectin repeat 


1e-19


74.9 


2 


2746- 
2790 


1298 


Plectin 


Plectin repeat 


8.3e-16 


60.6 


3 


2820- 
2864 


1298 


CBM_14 


Chitin binding Peritrophin-A domain 


0.0038 


11.3 


1 


2879- 
2896 


1298 


Plectin 


Plectin repeat 


2e-05 


22.8 


4 


2919- 
2951 


1298 


Plectin 


Plectin repeat 


0.018 


12.0 


6 


3024- 
3054 


1298 


Plectin 


Plectin repeat 


2.1e-20


77.4 


7 


3055- 
3099 


1298 


ECH 


Enoyl-CoA hydratase/isomerase family 


0.00096 


14.0 


1 


3071- 
3092 


1298 


Plectin 


Plectin repeat 


0.083 


9.6 


8 


3100- 
3130 


1298 


Plectin 


Plectin repeat 


1.3e-16 


63.5 


9 


3131- 
3175 


1298 


Plectin 


Plectin repeat 


0.44 


6.9 


10 


3181- 
3213 


1304 


DUF544 


Protein of unknown function (DUF544) 


5.8e-80 


275.8 


1 


157-282 


1305 


DUF544 


Protein of unknown function (DUF544) 


5.8e-80 


275.8 


1


272-397 


1306 


ig 


Immunoglobulin domain 


2.2e-08 


34.3 


1 


26-93 


1306 


ig 


Immunoglobulin domain 


2.5e-06 


26.5 


2 


132-191 


1306 


MAM 


MAM domain 


6.9e-72 


249.0 


1 


422-595 


1308 


APH 


Phosphotransferase enzyme family 


2.9e-42 


150.6 


1


40-256 


1308 


Acyl-CoA_dh_M


Acyl-CoA dehydrogenase, middle 
domain 


0.00024 


17.0


1 


505-585 


1308 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase, C-terminal 
doma 


6.7e-50 


175.9 


1 


618-769


1309


APH


Phosphotransferase enzyme family 


7.7e-32 


116.0 


1 


80-238 


1309


Acyl-CoA_dh_M 


Acyl-CoA dehydrogenase, middle 
domain 


0.00024 


17.0 


1 


487-567 


1309 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase, C-terminal 
doma 


6.7e-50 


175.9 


1 


600-751 


1310 


Cation efflux 


Cation efflux family 


3e-09 


34.4 


1


69-145 




CaMBD 


Calmodulin binding domain 


0.074 


7.8 




716-732 


1311 


IQ 


IQ calmodulin-binding motif 


1.3e-05 


22.1 


1


738-758 


1312


SAM




SAM domain (Sterile alpha motif)


0.00073 


15.4 




304-369 


1312 


SAM 


SAM domain (Sterile alpha motif) 


2.2e-10 


38.2 




382-446 


1312


SAM


SAM domain (Sterile alpha motif)


0.06 


8.7 




470-499 


1313




zf-C3HC4


Zinc finger, C3HC4 type (RING finger)


7.1e-25 


70.3 




80-126 


1313




Herpesvirus UL49.5
envelope/tegument pr


0.082 


7.4 


1


147-170 


1314


DUF692


Protein of unknown function (DUF692) 


0.088 


6.2 




1703- 
1722 


1314 


HECT 


HECT-domain (ubiquitin-transferase) 


1.9e- 
196 


662.8 


1 


2002- 
2309 


1314 


V-ATPase_C 


V-ATPase subunit C


0.032 


7.2 


1 


2185- 
2213 


1315 


PAP2 


PAP2 superfamily 


1.5e-25 


91.6 


1 


66-218 


1316 


PAP2 


PAP2 superfamily 


5.1e-30 


107.5 


1 


98-236 


1317 


ig 


Immunoglobulin domain 


8.1e-09 


35.9 


1 


41-116 


1321 


LRRNT 


Leucine rich repeat N-terminal domain 


1.3e-06 


24.2 


1 


115-143 


1321 


LRR 


Leucine Rich Repeat 


0.098 


8.6 


1 


145-168 


1321 


LRR 


Leucine Rich Repeat 


8.2e-06 


22.3 


2 


169-194 


1321 


FNIP 


FNIP Repeat 


0.36 


6.7 


1 


195-225 


1321 


LRR 


Leucine Rich Repeat 


0.63 


5.9 


3 


195-207 


1321 


LRR 


Leucine Rich Repeat 


0.0026 


13.9 


4 


240-265 


1321 


LRR 


Leucine Rich Repeat 


0.018 


11.1 


5 


266-285 


1321 


LRR 


Leucine Rich Repeat 


0.00014 


18.1 


6 


287-310 


1321 


LRR 


Leucine Rich Repeat 


0.00013 


18.3 


7 


311-336 


1321 


LRR 


Leucine Rich Repeat 


0.00015 


18.1 


8 


337-356 


1321 


LRR 


Leucine Rich Repeat 


0.22 


7.4 


9 


358-381 


1321 


LRR 


Leucine Rich Repeat 


0.002 


14.3 


10 


382-407 


1321 


LRR 


Leucine Rich Repeat 


0.022 


10.7 


11 


408-427 


1321 


LRR 


Leucine Rich Repeat 


0.00025 


17.3 


12 


453-478 


1321 


LRR 


Leucine Rich Repeat 


0.00049 


16.4 


13 


479-498 


1321 


LRR 


Leucine Rich Repeat 


0.13 


8.1 


15 


524-549 


1321 


LRR 


Leucine Rich Repeat 


0.00025 


17.3 


16 


550-569 


1321 


LRR 


Leucine Rich Repeat 


5.2e-05 


19.6 


17 


571-594 


1321 


LRR 


Leucine Rich Repeat 


0.37 


6.6 


18 


595-620 


1322 


ig


Immunoglobulin domain 


0.26 


7.8 


1 


50-117 


1322 


ig


Immunoglobulin domain 


0.00049 


18.0 


2 


157-215 


1322 


ig 


Immunoglobulin domain 


2.8e-09 


37.6 


3 


267-321 


1323 


ig 


Immunoglobulin domain 


0.24 


7.9


1 


50-117 


1323 


ig 


Immunoglobulin domain 


0.00049 


18.0 


2 


157-215 


1323 


ig 


Immunoglobulin domain 


0.00077 


17.2 


3 


267-303 


1324 


tsp 1 


Thrombospondin type I domain 


2.9e-07 


25.9 


1 


37-81 


1325 


Guanylin 


Guanylin precursor 


0.00035 


9.9 


1 


1-24 


1325 


Apo-CII


Apolipoprotein C-II 


9.1e-43


152.3 


1 


23-99 


1326 


Guanylin 


Guanylin precursor 


0.00035 


9.9 


1 


1-24 


1326 


Apo-CII 


Apolipoprotein C-II 


9.1e-43 


152.3 


1 


23-99 


1328 


SRCR 


Scavenger receptor cysteine-rich
domain


6.5e-37


131.9


1


14-111


1328


SRCR


Scavenger receptor cysteine-rich
domain


1.2e-34


123.9 


2 


188-285 


1328


SRCR


Scavenger receptor cysteine-rich 
domain 


4.7e-37 


132.4 


3 


300-397 


1328


SRCR




Scavenger receptor cysteine-rich 
domain 


1.5e-35 


127.1 


4 


405-503 


1328


DUF159


Uncharacterised ACR, COG2135 


0.092 


3.7 


1 


565-587 


1328 


SRCR 


Scavenger receptor cysteine-rich 
domain 


1.8e-27 


98.6 


5 


638-729 




1329


ig


Immunoglobulin domain 


0.81 


5.9 


1 


37-84 


1329


ig




Immunoglobulin domain 


0.051 


10.4 


2 


113-165 


mi 




EF hand 


0.025 


12.1 


1 


12-40 


ijj i 


efhand


EF hand 


0.97 


6.2 


2 


59-76 




efhand


EF hand 


0.041 


11.2 


3 


85-113 




wnt 


wnt family


6.9e- 
240 


694.6 


1


40-365 




7tm_1


7 transmembrane receptor (rhodopsin 
family) 


8.8e-18 


51.9 


1 


8-75 


1336 


SAP 


SAP domain 


3.8e-07 


29.1 


1 


11-45 


1336 


zf-MIZ 


MIZ zinc finger 


4.1e-41 


120.1 


1


323-375 


1337 


FA desaturase 


Fatty acid desaturase 


1.2e-76 


264.7 


1 


71-296 


1338 


cystatin 


Cystatin domain 


0.074 


6.5 


1 


25-45 


1340 


actin 


Actin 


5.4e-67 


221.4 


1 


4-362 


1340 


E1 N


E1 Protein, N terminal domain


0.08 


6.5 


1


149-158 


1341 


ion trans 


Ion transport protein 


0.007 


10.9 


1 


114-168 


1341 


ion trans 


Ion transport protein 


5e-05 


18.6 


2 


211-302 


1343 


ig


Immunoglobulin domain 


6.1e-06 


25.1 


1 


124-182 


1343 


ig


Immunoglobulin domain 


2.2e-06 


26.8 


2 


224-281 


1343 


ig


Immunoglobulin domain 


7.6e-08 


32.2 


3 


316-372 


1343 


fn3 


Fibronectin type III domain 


2.8e-16 


58.3 


1 


394-480 


1343 


fn3


Fibronectin type III domain 


6.6e-17 


60.5 


2 


492-578 


1343 


fn3


Fibronectin type III domain 


0.013 


10.8 


3 


598-654 


1344 


DUF84 


Protein of unknown function DUF84 


0.098 


5.9 


1 


8-22 


1344 


ig


Immunoglobulin domain 


3e-07 


30.0 


1 


53-110 


1344 


ig


Immunoglobulin domain 


1.8e-07 


30.9 


2 


150-216 


1344 


ig


Immunoglobulin domain 


2.9e-08 


33.8 


3 


255-310 


1344 


ig


Immunoglobulin domain 


4.6e-07 


29.3 


4 


350-417 


1344 


ig


Immunoglobulin domain 


1.1e-07


31.6 


5 


456-516 




1344


ig


Immunoglobulin domain 


8.8e-05 


20.8 


6 


553-617 


1344


MAM 


MAM domain 


6.7e-77 


265.6 


1 


753-918 


1345


kazal 


Kazal-type serine protease inhibitor 
domain 


7.7e-06 


25.8 


1 


121-168 


1345 


ig


Immunoglobulin domain 


1.2e-06


27.7 


1 


186-255 


1346 


RNA helicase 


RNA helicase 


0.031 


7.9 


1


82-109 


1346 


ATP-bind 


Conserved hypothetical ATP binding pr 


0.055 


7.3 


1 


87-100 


1348 


ig 


Immunoglobulin domain 


8.5e-07 


28.3 


1 


61-120 


1348 


ig


Immunoglobulin domain 


0.00026 


19.0 


2 


155-214 


1348 


ig 


Immunoglobulin domain 


4.7e-08 


33.0 


3 


258-315 


1348 


ig 


Immunoglobulin domain 


2.3e-05 


23.0 


4 


348-404 


1348 


ig


Immunoglobulin domain 


4.6e-09 


36.8 


5 


440-497 


1348 


ig


Immunoglobulin domain 


8.8e-07 


28.3 


6 


530-596 


1348 


fn3


Fibronectin type III domain 


5.2e-20 


71.3 


1 


615-704 


1348 


fn3 


Fibronectin type III domain 


0.0015 


14.1 


2 


717-807


1348


fn3


Fibronectin type III domain 


8.9e-14 


49.6 


3 


819-907 


1348 


fn3


Fibronectin type III domain 


0.00019 


17.2 


4 


919- 
1002 


1350 


serpin 


Serpin (serine protease inhibitor) 


5.1e- 
197 


664.7 


1 


45-378 


1350 


serpin 


Serpin (serine protease inhibitor) 


8e-09 


29.6 


2 


379-402 


1352 


DREV 


DREV methyltransferase 


7.3e- 
233 


680.7 


1 


56-317 


1353 


CARD 


Caspase recruitment domain 


2.6e-33 


119.8 


1 


2-91 


1355 


ank 


Ankyrin repeat 


1.8e-07


29.9 


2 


64-96 


1355 


ank 


Ankyrin repeat 


1.6e-06 


26.4 


3


97-129 


1355 


ank 


Ankyrin repeat 


3.8e-07 


28.7 


4 


130-162 


1355 


ank 


Ankyrin repeat 


0.00011 


19.9 


5 


163-195 


1355 


ank 


Ankyrin repeat 


0.00012 


19.8 


6 


196-228 


1356 


pkinase 


Protein kinase domain 


3.5e-64 


223.4 


1


221-479 


1356 


Aldolase 


KDPG and KHG aldolase 


0.038 


7.4 


1 


868-891 


1357 


pkinase 


Protein kinase domain 


2.8e-05 


18.9 


1 


43-72 


1357 


Aldolase 


KDPG and KHG aldolase 


0.038 


7.4 


1 


461-484 


1358 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


2.6e-13 


38.4 


1 


1-59 


1359 


tRNA-synt_1


tRNA synthetases class I (I, L, M and 
V)


0.00037 


12.8 


1 


53-115 


1359 


tRNA-synt_1e


tRNA synthetases class I (C) 


0.0002 


14.0 


1 


345-375 


1359 


tRNA-synt_1


tRNA synthetases class I (I, L, M and 
V) 


2.4e-07 


23.7 


2 


345-383 


1360 


MHC II beta 


Class II histocompatibility antigen, beta 


1.4e-43 


149.3 


1 


42-117 


1363 


ig 


Immunoglobulin domain 


0.86 


5.8 


1 


12-69 


1363 


ig 


Immunoglobulin domain 


0.17 


8.4 


2 


139-200 


1363


ig




Immunoglobulin domain 


0.00066 


17.5


3 


236-294 


1363 


ig


Immunoglobulin domain 


7.9e-06 


24.7 


4 


344-398 


1364 


fn3 


Fibronectin type III domain 


0.0032 


12.9 


1 


35-125 


1365 


IL1 


Interleukin-1 / 18


5.4e-31 


110.6 


1 


11-155 


1366 


A2M_N 


Alpha-2-macroglobulin family N- 
terminal regi 


1.5e-92 


317.7 


1 


6-613 


1366 


A2M 


Alpha-2-macroglobulin family 


3.6e- 
211 


711.7 


1 


722- 
1449 


1367 


ABC membrane 


ABC transporter transmembrane region 


1.7e-07 


28.5 


1 


1-70 


1368 


UPAR LY6 


u-PAR/Ly-6 domain 


2.6e-37 


134.1 


1 


27-106 


1967 


DUF99 


Protein of unknown function DUF99 


0.06 


5.8 


1 


3-26 


1967 


hormone 


Somatotropin hormone family 


1.6e-55 


156.0 


1 


29-141 


1968 


DUF99 


Protein of unknown function DUF99 


0.06 


5.8 


1 


3-26 


1968 


hormone 


Somatotropin hormone family 


1.6e-55 


156.0 


1 


29-141 


1969 


DUF99 


Protein of unknown function DUF99 


0.06 


5.8 


1 


3-26 


1969 


hormone 


Somatotropin hormone family 


1.6e-55 


156.0 


1 


29-141 


1970 


DUF99 


Protein of unknown function DUF99 


0.06 


5.8 


1 


3-26 


1970 


hormone 


Somatotropin hormone family 


1.6e-55 


156.0 


1 


29-141 


1971 


serpin 


Serpin (serine protease inhibitor) 


5.1e-83 


282.6 


1 


83-449 


1972 


PI-PLC-X 


Phosphatidylinositol-specific 
phospholipase 


3.8e-14 


50.6 


1 


1-33 


1973 


Lipase 3 


Lipase (class 3) 


1.7e-17 


62.0 


1 


399-538 


1976 


DUF846 


Eukaryotic protein of unknown 
function (DUF8 


0.0091 


7.9 


1 


79-109 


1977 


Monooxygenase 


Monooxygenase 


3.8e-12 


44.1 


1 


215-313 


1977 


Monooxygenase 


Monooxygenase 


1.7e-15


56.1 


2 


358-443


1980


zf-AN1


AN1-like Zinc finger


0.032 


10.1 


1 


59-98 


1980


zf-AN1


AN1-like Zinc finger


9.2e-06 


22.6 


2 


149-181 


1981


CRAL TRIO


CRAL/TRIO domain 


0.037 


7.8 


1 


10-38 




Rhomboid


Rhomboid family 


3.9e-32 


116.9 


1 


128-282 




LBP BPI CETP


LBP / BPI / CETP family, N-terminal
do 


4.5e-38 


130.4 


1 


33-191 


1984 


LBP BPI CETP 
C 


LBP / BPI / CETP family, C-terminal 
do 


8.3e-14 


49.9 


1 


253-456 


1987 


DUF572 


Family of unknown function (DUF572) 


3.5e-37 


133.7 


1 


1-61 


1987 


DUF572 


Family of unknown function (DUF572) 


5e-23 


84.4 


2 


91-149 


1988


Collagen


Collagen triple helix repeat (20 copi 


3.7e-11


44.2 


1 


1-51 


1988 


Collagen


Collagen triple helix repeat (20 copi 


6.6e-11


43.2 


2 


60-115 


1988 


Collagen


Collagen triple helix repeat (20 copi 


3.9e-13 


51.6 


3 


116-175 


1988 


Collagen


Collagen triple helix repeat (20 copi 


0.0069 


13.1 


4 


178-195 


1988 


Collagen 


Collagen triple helix repeat (20 copi 


0.0001 


20.0 


5 


199-230 


1988 


Collagen


Collagen triple helix repeat (20 copi 


4e-09 


36.5 


6 


239-298 


1988 


Collagen


Collagen triple helix repeat (20 copi 


1.9e-13


52.8 


7 


302-355 


1988 


Collagen 


Collagen triple helix repeat (20 copi 


7.1e-06 


24.3 


8 


362-395 


1988 


Collagen 


Collagen triple helix repeat (20 copi 


0.0012 


16.0 


9 


396-444 


1988 


C4 


C-terminal tandem repeated domain in 


2e-69 


240.8 


1 


450-557 


1988


C4 


C-terminal tandem repeated domain in 


1.3e-77 


268.0 


2 


558-672 


1989


ldl recept_b


Low-density lipoprotein receptor repeat 


7.3e-10 


34.9 


1 


56-97 


1989


ldl recept_b


Low-density lipoprotein receptor repeat 


2.7e-07 


26.4 


2 


99-141 


1989 


ldl recept_b


Low-density lipoprotein receptor repeat 


3.2e-07 


26.2 


3 


143-185 


1990 


ldl recept_b 


Low-density lipoprotein receptor repeat 


7.3e-10 


34.9 


1 


56-97 


1990


ldl recept_b


Low-density lipoprotein receptor repeat 


2.7e-07 


26.4 


2 


99-141 


1990


ldl recept_b


Low-density lipoprotein receptor repeat 


3.2e-07 


26.2 


3 


143-185 


1991


DUF846


Eukaryotic protein of unknown 
function (DUF8 


0.00016 


13.3 


1 


76-106 


1992 


cadherin 


Cadherin domain 


2.1e-10 


38.0 


1 


9-105 


1992 


cadherin 


Cadherin domain 


1.4e-28 


101.4 


2 


119-210 


1993


cadherin 


Cadherin domain 


2.1e-10 


38.0 


1 


9-105 


1993 


cadherin 


Cadherin domain 


1.4e-28 


101.4 


2 


119-210 


1995 


V1R 


Vomeronasal organ pheromone 
receptor family, 


3.8e-08 


27.0 


1 


4-36 


1998


ig




Immunoglobulin domain 


2.1e-09


38.1 


1 


18-76 


1998 


ig


Immunoglobulin domain 


7.9e-09 


35.9 


2 


121-179 


1998


ig


Immunoglobulin domain 


0.00014 


20.0 


3 


216-274 


1998 


ig


Immunoglobulin domain 


7.1e-09 


36.1 


4 


308-366 


1998


ig 


Immunoglobulin domain 


1.7e-10 


42.2 


5 


403-461 


1999


SPRY


SPRY domain 


1.8e-30 


107.5 


1 


148-277 


1999


SRP54


SRP54-type protein, GTPase domain


0.0091 


11.6 


1 


310-325 


1999 


AAA 


ATPase family associated with various 
cellul 


0.098 


5.8 


1 


313-325 


2000 


ABC tran 


ABC transporter 


2.5e-43


146.2 


1 


118-301 


2002 


Acyl-CoA_dh_M 


Acyl-CoA dehydrogenase, middle 
domain 


0.0071 


11.7 




99-136 


2002 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase, C-terminal 
doma 


6.7e-50 


175.9 




415-566 


2003 


C tripleX 


Cysteine rich repeat 


2e-05 


17.8 




76-93 


2003 


EGF 


EGF-like domain 


8.7e-06 


23.6 




115-143 


2003 


TIL 


Trypsin Inhibitor like cysteine rich 
domain 


0.0035 


11.0 




134-155 


2003 


EGF 


EGF-like domain 


7.5e-05 


20.2 


3 


155-189


2003


TIL 


Trypsin Inhibitor like cysteine rich 
domain 


0.26 


5.1 


2 


168-195 


2003 


EGF 


EGF-like domain 


4.4e-05 


21.1 


4 


195-228 


2003 


EGF 


EGF-like domain 


9.7e-09 


34.3 


5 


240-275 


2003 


MAM 


MAM domain 


9.2e-38 


135.6 


1 


421-566 


2004 


NHL 


NHL repeat 


1.1e-10


42.4 


1 


8-35 


2004 


NHL 


NHL repeat 


2.5e-09 


37.6 


2 


55-82 


2004 


NHL 


NHL repeat 


7.8e-11


43.0 


3 


102-129 


2005 


FCH 


Fes/CIP4 homology domain 


0.026 


10.3 


1 


310-350 


2005 


DAG PE-bind 


Phorbol esters/diacylglycerol binding 
dom 


2.8e-05 


21.7 


1 


738-776 


2005 


RhoGAP 


RhoGAP domain 


3.9e-68 


231.7 


1 


804-976 


2006 


CN hydrolase 


Carbon-nitrogen hydrolase 


4.5e-07 


26.2 


2 


117-206 


2007 


tsp 1


Thrombospondin type 1 domain 


0.054 


8.4 


1


5-23 


2008 


Adaptin N 


Adaptin N terminal region 


7.5e-09 


29.6 


1 


1-51 


2008 


Alpha adaptinC2 


Adaptin C-terminal domain


4.4e-38 


126.8 


1 


183-296 


2008 


Alpha adaptin_C 


Alpha adaptin AP2, C-terminal domain 


1.6e- 
113 


334.2 


1 


302-414 


2009 


ig


Immunoglobulin domain


0.0045 


14.4 


1 


42-129 


2009 


ig


Immunoglobulin domain 


0.19 


8.3 


2 


179-272 


2009 


ig


Immunoglobulin domain 


9.7e-05 


20.6 


3 


319-408 


2009 


ig


Immunoglobulin domain 


0.00014 


20.0 


4 


455-546 


2010 


ig


Immunoglobulin domain 


0.0045 


14.4 


1 


42-129 


2010 


ig 


Immunoglobulin domain 


0.19 


8.3 


2 


179-272 


2010 


ig


Immunoglobulin domain 


9.7e-05 


20.6 


3 


319-408 


2010 


ig 


Immunoglobulin domain 


0.00014 


20.0 


4 


455-546 


2011 


ig 


Immunoglobulin domain 


0.0045 


14.4 


1 


42-129 


2011 


ig 


Immunoglobulin domain


0.19 


8.3 


2 


179-272 


2011 


ig


Immunoglobulin domain 


9.7e-05 


20.6 


3 


319-408 


2011 


ig


Immunoglobulin domain 


0.00014 


20.0 


4 


455-546 


2012 


ig


Immunoglobulin domain 


0.0045 


14.4 


1


42-129 


2012 


ig


Immunoglobulin domain 


0.19 


8.3 


2 


179-272 


2012 


ig


Immunoglobulin domain 


9.7e-05 


20.6 


3 


319-408 


2012 


ig 


Immunoglobulin domain 


0.00014 


20.0 


4 


455-546 


2016 


TFA 


Transcription elongation factor A, SII-r


3.4e-23 


87.2 


1 


148-283 


2018 


cadherin 


Cadherin domain 


8e-13


46.4 


1 


1-49 


2018 


cadherin 


Cadherin domain 


4.5e-09 


33.4 


2 


76-120 


2019 


PGM_PMM 


Phosphoglucomutase/phosphomannom 
utase, C-ter 


0.041 


9.3 


1 


347-389 


2020 


MACPF 


MAC/Perforin domain 


0.00017 


15.5 


1 


132-164 


2021 


KRAB 


KRAB box 


6.9e-24 


88.6 


1 


54-94 


2022 


KRAB 


KRAB box 


6.9e-24 


88.6 


1 


54-94 


2023 


EMP24 GP25L 


emp24/gp25L/p24 family 


1.9e-15 


55.4 


1 


17-78 


2024 


acid_phosphat 


Histidine acid phosphatase 


7.9e- 
159 


537.8 


1 


35-375 


2026 


KRAB 


KRAB box 


1.1e-20


77.0 


1 


132-172 


2026 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-07 


33.4 


1 


485-507 


2026 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.54 


2.9 


1 


500-518 


2026 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-05


27.2 


2 


513-535 


2026 


zf-C2H2 


Zinc finger, C2H2 type 


3.4e-08 


37.4 


3 


543-565 


2026 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.032 


6.3 


2 


558-576 


2026 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-06 


28.6 


4 


571-593 


2027 


Vps16_N


Vps16, N-terminal region


2.3e- 
107 


366.9 


1 


1-165


2027


Vps16_C


Vps16, C-terminal region


5e-203 


684.6 


1 


262-580 


2028 


LRRNT 


Leucine rich repeat N-terminal domain 


0.0011 


14.5 


1 


11-40 


2029 


A2M 


Alpha-2-macroglobulin family 


6.3e-23 


75.5 


1 


4-86 


2031 


fn3


Fibronectin type III domain 


4.9e-08 


29.7 


1 


3-33 


2031 


fn3 


Fibronectin type III domain 


9.1e-08 


28.7 


2 


46-136 


2032 


LRR 


Leucine Rich Repeat 


0.021 


10.8 


1


2-25 


2032 


LRR 


Leucine Rich Repeat 


3e-05 


20.4 


2 


26-49 


2032 


LRR 


Leucine Rich Repeat 


0.00019 


17.8 


3 


50-73 


2032 


LRR 


Leucine Rich Repeat 


0.16 


7.8 


4 


74-94 


2032 


LRRCT 


Leucine rich repeat C-terminal domain 


2.2e-05 


17.6 


1 


118-132 


2033 


LRR 


Leucine Rich Repeat 


0.021 


10.8 


1 


2-25 


2033 


LRR 


Leucine Rich Repeat 


3e-05 


20.4 


2 


26-49 


2033 


LRR 


Leucine Rich Repeat 


0.00019 


17.8 


3 


50-73 


2033 


LRR 


Leucine Rich Repeat 


0.16 


7.8 


4 


74-94 


2033 


LRRCT 


Leucine rich repeat C-terminal domain 


2.2e-05 


17.6 


1 


118-132 


2034 


EGF 


EGF-like domain 


0.76 


5.8 


1 


135-157 


2034 


SEA 


SEA domain 


4.9e-06 


22.1 


1


192-261 


2034 


ig


Immunoglobulin domain 


9.8e-07 


28.1 


1 


310-376 


2034 


ig


Immunoglobulin domain 


0.33 


7.4 


2 


509-571 


2034 


GPS 


Latrophilin/CL-1-like GPS domain


2e-14 


54.5 


1 


975- 
1027 


2034 


7tm_2 


7 transmembrane receptor (Secretin 
family) 


2.8e-20 


71.1 


2 


1086- 
1298 


2035 


TFIIS 


Transcription factor S-II (TFIIS) 


0.019 


10.6 


1 


21-31 


2035 


zf-C2H2 


Zinc finger, C2H2 type 


3.1e-06 


29.7 


2 


21-43 


2035 


zf-C2H2 


Zinc finger, C2H2 type 


2.7e-07 


33.9 


3 


49-71 


2035 


zf-BED 


BED zinc finger 


0.63 


4.8 


1 


50-72 


2035 


XPA N 


2/4 46 56.. 1 11 


0.22 


7.0 


3 


74-86 


2035 


zf-C2H2 


Zinc finger, C2H2 type 


8.8e-08 


35.9 


4 


77-99 


2035 


TFIIS 


Transcription factor S-II (TFIIS) 


0.036 


9.7 


4 


105-115 


2035 


zf-C2H2 


Zinc finger, C2H2 type 


0.0096 


15.6 


5 


105-120 


2038 


zf-C2H2 


Zinc finger, C2H2 type 


0.0099 


15.5 


1 


197-220 


2039 


FHA 


FHA domain 


0.024 


11.6 


1 


45-110 


2039 


HIT 


HIT domain 


0.013 


8.5 


1 


201-226 


2039 


zf-C2H2 


Zinc finger, C2H2 type 


0.026 


13.9 


1 


337-359 


2040 


FHA 


FHA domain 


0.024 


11.6 


1 


45-110 


2040 


HIT 


HIT domain 


0.013 


8.5 


1 


201-226 


2040 


zf-C2H2 


Zinc finger, C2H2 type 


0.026 


13.9 


1 


337-359 


2041 


FHA 


FHA domain 


0.024 


11.6 


1 


45-110 


2041 


HIT 


HIT domain 


0.013 


8.5 


1 


201-226 


2041 


zf-C2H2 


Zinc finger, C2H2 type 


0.026 


13.9 


1 


337-359 


2042 


Cwf Cwc_15 


Cwf15/Cwc15 cell cycle control protei


8.6e- 
161 


544.3 


1 


2-230 


2043 


SRCR 


Scavenger receptor cysteine-rich 
domain 


6.5e-15 


54.2 


1 


8-113 


2043 


Lysyl_oxidase 


Lysyl oxidase 


1.9e- 
140 


476.7 


1 


117-286 


2045 


WD40 


WD domain, G-beta repeat 


0.5 


6.4 


2 


192-217 


2045 


WD40 


WD domain, G-beta repeat 


5.2e-06 


23.8 


3 


248-274 


2045 


DUF130 


Domain of unknown function DUF130 


0.074 


5.9 


1 


264-278 


2045 


WD40 


WD domain, G-beta repeat 


0.35 


7.0 


4 


397-424 


2048 


CTP_transf_1


Cytidylyltransferase family 


4.9e- 
124 


422.2 


1 


86-417 


2049 


CBM 20 


Starch binding domain 


0.078 


8.5 


1 


14-33


2049


WD40 


WD domain, G-beta repeat 


3.9e-08 


31.2 


1 


93-131 


2052 


7tm_3


7 transmembrane receptor
(metabotropic gluta


5.8e-06 


21.5 


1


112-153 


2052 


KdgT 


2-keto-3-deoxygluconate permease 


0.068 


7.2 


1


200-226 


2052 


7tm_3


7 transmembrane receptor 
(metabotropic gluta 


7.8e-05 


17.4 




221-305 


2054 


HesB-like 


HesB-like domain 


2.8e-41


132.5 


1 


52-154 


2055 


ig 


Immunoglobulin domain 


0.032 


11.2 


1 


37-59 


2055 


ig


Immunoglobulin domain 


0.00033 


18.6 




98-157 


2056 


Mpv17 PMP22


Mpv17/PMP22 family


8e-14 


51.5 


1 


101-163 


2058 


Collagen


Collagen triple helix repeat (20 copies) 


0.013 


12.1 




17-38 


2058 


Collagen 


Collagen triple helix repeat (20 copies) 


2.5e-07 


29.8 




40-79 


2058 


vwa 


von Willebrand factor type A domain 


3.2e-13 


42.1 




108-156 


2059 


Sterol desat 


Sterol desaturase 


8.6e-41 


138.1 


1 


1-139 


2060


ig




Immunoglobulin domain 


0.27 


7.7 


1


8-26 


2060


ig




Immunoglobulin domain 


5.2e-08 


32.9 




97-158 


2061 


RNA helicase 


RNA helicase 


0.00029 


15.0 


1


40-63 


2061 


AAA 


ATPase family associated with various 
ce 


0.00038 


13.8 




42-58 


2061 


NACHT 


NACHT domain 


0.0022 


12.0 


— 


44-66 


2061 


ADK 


Adenylate kinase 


2.2e-05 


19.0 




77-124 


2064 


UDPGT 


UDP-glucoronosyl and UDP-glucosyl 
transferas 


9.7e-34 


118.7 


1 


1-63 


2065 


TRAPP_Bet3 


Transport protein particle (TRAPP) 
compone 


9e-70 


242.0 


1 


18-171 


2066 


DUF846 


Eukaryotic protein of unknown 
function (DUF8 


0.013 


7.4 


1 


83-101 


2068


ig




Immunoglobulin domain 


0.0042 


14.5 


1


33-110 


2068 


FliL 


Flagellar basal body-associated protein 
FliL 


0.029 


9.2




170-203 


2068 


DcuC 


C4-dicarboxylate anaerobic carrier 


0.044 


7.9 


1 


174-193 


2069 


ig 


Immunoglobulin domain 


0.0042 


14.5 


1 


33-110 


2069 


FliL 


Flagellar basal body-associated protein 
FliL 


0.029 


9.2 


1 


170-203 


2069 


DcuC 


C4-dicarboxylate anaerobic carrier 


0.044 


7.9 




174-193 


2070 


ig 


Immunoglobulin domain 


0.0042 


14.5 


1


33-110 


2070 


FliL 


Flagellar basal body-associated protein 
FliL 


0.029 


9.2 


1 


170-203 


2070 


DcuC 


C4-dicarboxylate anaerobic carrier 


0.044 


7.9 


1 


174-193 


2071 


PH 


PH domain 


1.9e-21 


72.0 


1 


75-173 


2072 


Ifi-6-16 


Interferon-induced 6-16 family 


3.7e-46 


159.7 


1 


41-123 


2073 


Ifi-6-16 


Interferon-induced 6-16 family 


3.7e-46 


159.7 


1 


41-123 


2074 


Ribosomal L34e 


Ribosomal protein L34e 


3.5e-72 


232.6 


1 


12-110 


2075 


CDC50 


LEM3 (ligand-effect modulator 3) 
family/ CD 


0.049 


6.6 


1 


90-117 


2077 


EGF 


EGF-like domain 


0.0019 


15.2 




60-95 


2078 


EGF 


EGF-like domain


0.0019 


15.2 




60-95 


2079 


EGF 


EGF-like domain 


0.0019 


15.2 




60-95 


2080 


ig 


Immunoglobulin domain 


4.9e-06 


25.4 




109-171 


2081 


Monooxygenase 


Monooxygenase 


0.0069 


10.9 




593-611 


2081 


ras 


Ras family 


7.2e-10 


33.6 




924-967 


2082 


Alpha_adaptin_C 


Alpha adaptin AP2, C-terminal domain 


0.061 


5.2 




97-109 


2082 


MHC I 


Class I Histocompatibility antigen, d 


0.00048 


14.9 


2 


125-210 


2083


ig 


Immunoglobulin domain 


4.1e-05 


22.0 


1 


10-78


2083


ig




Immunoglobulin domain 


3.7e-10 


40.9 


2 


113-172 


2083


ig


Immunoglobulin domain 


0.0018 


15.9 


3 


211-272 


2083


ig


Immunoglobulin domain 


3.7e-08 


33.4 


4 


309-370 


2083 


DNA_pol_B_2


DNA polymerase type B, organellar 
and 


0.018 


7.9 


1 


326-382 


2083


OapA


Opacity-associated protein A 


0.44 


2.4 


1 


335-357 


2083


ig


Immunoglobulin domain


0.0012 


16.6 


5 


404-465 


2083


ig


Immunoglobulin domain


7.7e-07 


28.5 


6 


500-564 


2084


ig


Immunoglobulin domain


4.1e-05 


22.0 


1 


10-78 


2084


ig




Immunoglobulin domain 


3.7e-10 


40.9 


2 


113-172 


2084


ig





Immunoglobulin domain 


0.0018 


15.9 


3 


211-272 


2084


ig


Immunoglobulin domain


3.7e-08 


33.4 


4 


309-370 


2084 


DNA_pol_B_2


DNA polymerase type B, organellar
and


0.018 


7.9 


1 


326-382 


2084


OapA


Opacity-associated protein A


0.44 


2.4 


1 


335-357 


2084


ig 


Immunoglobulin domain 


0.0012 


16.6 


5 


404-465 


2084


ig


Immunoglobulin domain


7.7e-07 


28.5 


6 


500-564 


2085


ig 


Immunoglobulin domain 


4.1e-05 


22.0 


1 


10-78 


2085


ig 


Immunoglobulin domain 


3.7e-10 


40.9 


2 


113-172 


2085


ig


Immunoglobulin domain 


0.0018 


15.9 


3 


211-272 


2085


ig 


Immunoglobulin domain 


3.7e-08 


33.4 


4 


309-370 


2085


DNA_pol_B_2


DNA polymerase type B, organellar
and


0.018 


7.9 


1 


326-382 


2085 


OapA 


Opacity-associated protein A 


0.44 


2.4 


1 


335-357 


2085


ig


Immunoglobulin domain 


0.0012 


16.6 


5 


404-465 




2085


ig


Immunoglobulin domain 


7.7e-07 


28.5 


6 


500-564 


2086


rjj 


P53 


3.5e-09 


33.8 


1 


7-32 


2087


Apolipoprotein


Apolipoprotein A1/A4/E family 


2.3e-11


42.3 


1 


93-168 


2087


DUF260


Protein of unknown function DUF260 


0.64 


3.5 


1 


94-107 


2087


Adeno PIX 


Adenovirus hexon-associated protein ( 


0.49 


4.4 


1 


95-110 


2087 


BcrAD BadFG 


BadF/BadG/BcrA/BcrD ATPase family 


0.12 


6.2 


1 


134-180 


2087


Apolipoprotein


Apolipoprotein A1/A4/E family


0.011 


10.5 


2 


172-258 


2087


MM_CoA_mutase


Methylmalonyl-CoA mutase


0.84 


1.9 


1 


264-306 


2088 


Apolipoprotein


Apolipoprotein A1/A4/E family 


2.3e-11


42.3 


1 


93-168 


2088


DUF260 


Protein of unknown function DUF260 


0.64 


3.5 


1 


94-107 


2088 


Adeno PIX 


Adenovirus hexon-associated protein ( 


0.49 


4.4 


1 


95-110 


2088


BcrAD_BadFG


BadF/BadG/BcrA/BcrD ATPase family


0.12 


6.2 


1 


134-180 


2088


Apolipoprotein


Apolipoprotein A1/A4/E family


0.011 


10.5 


2 


172-258 


2088


MM_CoA_mutase


Methylmalonyl-CoA mutase


0.84 


1.9 


1 


264-306 


2089


DUF717


Protein of unknown function (DUF717) 


1 


4.0 


1 


68-80 


2089


MHC_I


Class I Histocompatibility antigen, d


0.69 


3.7 


1 


185-198 


2090




Poxvirus D5 protein-like


1 


2.2 


1 


21-33 


2090 


phoslip 


Phospholipase A2 


3.4e-49 


172.4 


1 


26-150 


2090 


RFX_DNA_binding


RFX DNA-binding domain 


0.84 


2.9 


1 


55-62 


2092 


MR MLE N 


Mandelate racemase / muconate lactoni 


1.6e-05 


17.0 


1 


54-157 


2092 


Peptidase S26 


Signal peptidase I 


0.38 


3.8 


1 


99-129 


2092 


CheR N 


CheR methyltransferase, all-alpha dom 


0.4 


6.7 


1


103-119 


2092 


MR MLE 


Mandelate racemase / muconate lactoni 


2.5e-08 


29.9 


1 


236-298 


2094 


PP2C 


Protein phosphatase 2C 


1.2e-71 


248.2 


1 


136-412 


2095 


EGF 


EGF-like domain 


0.64 


6.1 


1 


3-29 


2095 


EGF 


EGF-like domain 


6.2e-05 


20.5 


2 


35-68 
2095


EGF 


EGF-like domain


0.00015 


19.1 


3 


94-131 


2096 


7tm 1 


7 transmembrane receptor (rhodopsin f 


8.6e-47 


138.9 


1 


83-332 


2097 


7tm 1


7 transmembrane receptor (rhodopsin f 


8.6e-47 


138.9 


1 


83-332 


2098 


DSL 


Delta serrate ligand 


0.018 


9.3 


1 


22-37 


2098 


EGF 


EGF-like domain 


0.0067 


13.2 




44-71 


2098 


TIL 


Trypsin Inhibitor like cysteine rich 


0.33 


4.8 


1


46-66 


2098 


DSL


Delta serrate ligand 


0.48 


4.7 




56-71 


2099 


TEP1 N 


TEP1 N-terminal domain


0.85 


4.7 


1 


36-65 


2099 


Pox A46 


Poxvirus A46 family 


0.55 


2.5 


1


61-75 


2099 


ExoD 


Exopolysaccharide synthesis, ExoD


0.82 


2.4 




124-147 


2099 


RhoGAP


RhoGAP domain 


4e-28 


95.9 


1 


161-255 


2102 


myosin head


Myosin head (motor domain) 


6.3e-56 


189.4 


1 


9-183 


2102 


ATP bind2 


P-loop ATPase protein family 


0.16 


4.9 


1 


75-88 


2102 


PRK 


Phosphoribulokinase / Uridine kinase fa


0.14 


5.2 


1 


77-88 


2103 


myosin head 


Myosin head (motor domain) 


6.3e-56 


189.4 


1


9-183 


2103 


ATP bind2 


P-loop ATPase protein family 


0.16 


4.9 


1 


75-88 


2103 


PRK 


Phosphoribulokinase / Uridine kinase fa


0.14 


5.2 


1 


77-88 


2105 


kazal 


Kazal-type serine protease inhibitor 


8.4e-08 


33.5 


1 


73-117 


2105 


thyroglobulin_1


Thyroglobulin type-1 repeat 


7.7e-19 


72.8 




255-317 


2108 


BEX 


Brain expressed X-linked like family 


9.8e-86 


266.4 


1


79-190 


2108 


ChaC 


ChaC-like protein 


0.2 


4.5 


1 


132-157 


2108 


IlvC 


Acetohydroxy acid isomeroreductase, ca


0.14 


5.9 


1 


133-162 


2109 


LRRCT 


Leucine rich repeat C-terminal domain 


8.5e-09 


28.1 


1


45-91 


2109 


UPF0118 


Domain of unknown function DUF20 


1 


2.9 


1


219-242 


2112 


Inh 


Protease inhibitor Inh 


0.026 


9.0 


1


19-44 


2112 


ank 


Ankyrin repeat 


0.0042 


14.2 


1


26-45 


2113 


DUF370 


Domain of unknown function (DUF370)


1 


3.5 


1


24-39 


2113 


ApoL 


Apolipoprotein L 


4e-191 


645.1 


1


46-348 


2113 


HupH C 


HupH hydrogenase expression protein, 


0.99 


2.7 




119-134 


2114 


DUF370 


Domain of unknown function (DUF370)


1 


3.5 


1


24-39 


2114 


ApoL 


Apolipoprotein L 


4e-191 


645.1 




46-348 


2114 


HupH_C 


HupH hydrogenase expression protein, 


0.99 


2.7 


1


119-134 


2115 


MAM 


MAM domain 


1.5e-43 


154.8 


1


3-102 


2116 


MAM 


MAM domain 


1.5e-43 


154.8 


1


3-102 


2117 


CBF 


CBF/Mak21 family 


0.00014 


14.4 


1


32-65 


2118 


PLA2 B 


Lysophospholipase catalytic domain 


7.6e-30 


104.2 


1


14-143 


2118 


DUF188 


Uncharacterized BCR, YaiI/YqxD family CO


0.9 


2.9 


1


140-151 


2119 


PLA2 B 


Lysophospholipase catalytic domain 


7.6e-30 


104.2 


1


14-143 


2119 


DUF188 


Uncharacterized BCR, YaiI/YqxD family CO


0.9 


2.9 


1


140-151 


2121 


p450 


Cytochrome P450 


1.6e-05


16.5 




31-143 


2121 


Phage_attach 


Phage Head-Tail Attachment 


0.97 


1.6 




100-111 


2122 




Immunoglobulin domain 


1.5e-12 


49.8 




38-96 


2122 


ig


Immunoglobulin domain 


2.3e-06 


26.7 




134-213 


2122 


CD36 


CD36 family 


0.38 


3.9 




246-271 


2122 


Neur_chan_memb


Neurotransmitter-gated ion-channel tra 


0.69 


2.3 




261-270 


2123


ig


Immunoglobulin domain 


1.5e-12 


49.8 


1


38-96 
2123


ig 


Immunoglobulin domain 


2.3e-06 


26.7 


2 


134-213 


2123 


CD36 


CD36 family 


0.38 


3.9 




246-271 


2123 


Neur_chan_memb


Neurotransmitter-gated ion-channel tra 


0.69 


2.3 


1 


261-270 


2124 


ig


Immunoglobulin domain 


1.5e-12


49.8 


1


38-96 


2124 


ig 


Immunoglobulin domain 


2.3e-06 


26.7 




134-213 


2124 


CD36 


CD36 family 


0.38 


3.9 


1


246-271 


2124 


Neur_chan_memb


Neurotransmitter-gated ion-channel tra 


0.69 


2.3 


1 


261-270 


2125 


C2 


C2 domain 


0.15 


6.6 


1 


33-48 


2125 


C2 


C2 domain 


8.3e-37 


125.8 




92-180 


2126 


DUF1058 


Protein of unknown function (DUF1058)


0.49 


2.3 


1 


80-93 


2126 


Pep_M12B_propep


Reprolysin family propeptide 


1.9e-05 


17.5 


1 


155-208 



2127 


Ifi-6-16 


Interferon-induced 6-16 family 


3.7e-46 


159.7 




41-123 


2127 


GLTT 


GLTT repeat (6 copies) 


0.18 


7.7 


1


50-78 


2127 


CRCB 


CrcB-like protein 


0.18 


7.1 




106-124 


2128 


abhydrolase 


alpha/beta hydrolase fold 


0.02 


9.2 


1


74-127 


2128 


lipase 


Lipase 


0.64 


3.7 




98-126 


2128 


abhydrolase 


alpha/beta hydrolase fold 


0.0083 


10.5 




167-237 


2128 


DLH 


Dienelactone hydrolase family 


0.4 


3.6 




169-196 


2128 


LIP 


Secretory lipase 


0.012 


8.6 


1 


178-203 


2128 


UPF0227 


Uncharacterised protein family (UPF02 


0.38 


4.9


1 


179-209 


2128 


abhydrolase_2


Phospholipase/Carboxylesterase 


0.015 


10.1 




180-203 


2128 


Peptidase_M10_N


Matrix metalloprotease, N-terminal do 


0.63 


2.5 


1 


209-230 


2129 


abhydrolase 


alpha/beta hydrolase fold 


0.02 


9.2 


1 


74-127 


2129 


lipase 


Lipase 


0.64 


3.7 


1 


98-126 


2129 


abhydrolase 


alpha/beta hydrolase fold 


0.0083 


10.5 




167-237 


2129 


DLH 


Dienelactone hydrolase family 


0.4 


3.6 


1 


169-196 


2129 


LIP 


Secretory lipase 


0.012 


8.6 


1 


178-203 


2129 


UPF0227 


Uncharacterised protein family (UPF02 


0.38 


4.9 




179-209 


2129 


abhydrolase_2 


Phospholipase/Carboxylesterase 


0.015 


10.1 




180-203 


2129 


Peptidase_M10_N


Matrix metalloprotease, N-terminal do 


0.63 


2.5 


1 


209-230 


2130 


Collagen 


Collagen triple helix repeat (20 copie 


1.4e-06 


27.0 


1 


1-38


2130 


Collagen 


Collagen triple helix repeat (20 copie 


2.5e-05 


22.3 




39-74 


2130 


SRCR 


Scavenger receptor cysteine-rich domai 


2.6e-16


59.1 


1 


90-126 


2131 


Collagen 


Collagen triple helix repeat (20 copie 


1.4e-06 


27.0 


1 


1-38 


2131 


Collagen 


Collagen triple helix repeat (20 copie 


2.5e-05 


22.3 




39-74 


2131 


SRCR 


Scavenger receptor cysteine-rich domai 


2.6e-16 


59.1 


1 


90-126 


2132 


RICH 


RICH domain 


0.3 


5.4 




290-320 


2132 


DUF260 


Protein of unknown function DUF260 


0.047 


7.1 


1 


425-447 


2132 


Ter 


DNA replication terminus site-binding


0.019 


7.5 




427-450 


2132 


Tropomyosin 


Tropomyosin 


U.Z/ 




~T 


*tuo vvu 


2132 


Adeno PIX 


Adenovirus hexon-associated protein ( 


0.044 


8.0 




482-506 


2132 


AgrD 


Staphylococcal AgrD protein 


0.83 


5.2 




501-508 


2132 


K-box 


K-box region 


0.0023 


12.6 




569-602 


2132 


Tfb2 


Transcription factor Tfb2 


0.98 


-1.2 




591-610 


2132 


RRF 


Ribosome recycling factor 


0.5 


5.0 


2 


696-727 


2132 


G-gamma 


GGL domain 


0.33 


5.0 


1 


717-738 


2132 


DUF260 


Protein of unknown function DUF260 


0.39 


4.2 


2 


821-843 


2132 


bZIP 


bZIP transcription factor 


0.52 


5.5 


2 


835-873 
2132


Lipoprotein_11


Lepidopteran low molecular weight (30 


1 


2.9 




850-867 


2132 


DNA_ligase_N


NAD-dependent DNA ligase adenylation


0.081 


5.7 


1


868-888 


2133 


KRAB 


KRAB box 


2.9e-27 


100.7 


1 


61-101 


2133 


Androgen recep 


Androgen receptor 


0.71 


0.7 


1 


70-80 


2133 


TFIIS 1 


Transcription factor S-II (TFIIS) 


0.73 


5.1 


1 


324-334 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


3.5e-05 


25.4


1 


324-346 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-06 


31.2 




352-374 


2133 


zf-BED 


BED zinc finger 


0.33 


5.7 


1


354-375 


2133 


mRNA_cap_enzyme


mRNA capping enzyme, catalytic domain


0.56 


0.5 


1 


377-392 


2133 


XPA N 


XPA protein N-terminal 


0.78 


5.1 


2 


377-389 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


2.9e-07 


33.8 


3 


380-402 


2133 


TFIIS 1 


Transcription factor S-II (TFIIS) 


0.89 


4.8 


3 


408-418 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


2e-06 


30.4 


4 


408-430 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-05 


26.8 


5 


436-458 


2133 


mRNA_cap_enzyme


mRNA capping enzyme, catalytic domain


0.56 


0.5 


2 


461-476 


2133 


XPA N 


XPA protein N-terminal 


0.78 


5.1 


4 


461-473 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


5.4e-07 


32.7 


6 


464-486 


2133 


TFIIS 


Transcription factor S-II (TFIIS) 


0.29 


6.5 


5 


492-502 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-06


31.5 


7 


492-514 


2133 


XPA_N 


XPA protein N-terminal 


0.13 


7.8 


6 


517-529 


2133 


TFIIS 


Transcription factor S-II (TFIIS) 


0.57 


5.5 


6 


520-530 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


9.2e-07 


31.8 


8 


520-542 


2133 


XPA N 


XPA protein N-terminal 


0.97 


4.8 


7 


545-557 


2133 


TFIIS 


Transcription factor S-II (TFIIS) 


0.14 


7.6 


7 


548-558 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-06 


29.1 


9 


548-570 


2133 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.38 


3.3 


1 


560-581 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-06


31.5 


10 


576-598 


2133 


TFIIS 


Transcription factor S-II (TFIIS) 


0.054 


9.0 


8 


604-614 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


2.9e-07 


33.8 


11 


604-626 


2133 


zf-BED 


BED zinc finger 


0.64 


4.8 


3 


609-627 


2133 


DC1 


1/2 604 619.. 19 44 


0.16 


6.2 


2 


632-647 


2133 


zf-C2H2 


Zinc finger, C2H2 type 


0.00082 


19.9 


12 


632-655 


2137 


aminotran 3 


Aminotransferase class-III


1.2e-09 


31.3 


1 


55-114 


2137 


OATP N 


Organic Anion Transporter Polypeptide 


0.81 


4.0 


1 


140-158 


2137 


aminotran 3 


Aminotransferase class-III


8.1e-63 


208.6 


2 


181-409 


2138 


aminotran 3 


Aminotransferase class-III


1.2e-09 


31.3 


1 


55-114


2138 


OATP N 


Organic Anion Transporter Polypeptide 


0.81 


4.0 


1 


140-158 


2138 


aminotran 3 


Aminotransferase class-III


8.1e-63 


208.6 


2 


181-409 


2139 


trypsin 


Trypsin 


1.3e-25 


79.1 


1 


8-114 


2140 


Glycos_transf_1


Glycosyl transferases group 1 


1.7e-17 


64.4 


1 


99-194 


2141 


MHYT 


Bacterial signalling protein N termina 


0.6 


4.2 


1 


291-328 


2142 


EGF 


EGF-like domain 


8.8e-09 


34.4 


1 


1-30 


2142 


EGF 


EGF-like domain


1.5e-07 


30.0 


2 


41-72 


2142 


EGF 


EGF-like domain 


0.0091 


12.7 


3 


82-107 


2142 


EB 


EB module 


0.077 


7.1 


2 


116-148 


2142 


EGF 


EGF-like domain 


1.3e-07


30.2 


4 


116-148 


2142 


EGF 


EGF-like domain 


0.022 


11.3 


5 


157-181 


2143 


AdoHcyase 


S-adenosyl-L-homocysteine hydrolase 


0.0022 


9.6 


1 


1-15 


2143 


AdoHcyase_NAD


S-adenosyl-L-homocysteine hydrolase, NA


0.0012 


13.8 


1 


16-27 


2144 


UQ con 


Ubiquitin-conjugating enzyme 


0.0058 


11.9 


1 


31-61 
2144


UQ con


Ubiquitin-conjugating enzyme 


5.8e-25 


91.7 


2 


89-156 


2145 


Prominin 


Prominin 


2.1e-113


364.0 


1


25-213 


2145 


SPDY 


Domain of unknown function (DUF317)


0.15 


6.5 


1 


87-100 


2145 


Prominin 


Prominin 


6.7e-30 


94.4 


2 


214-286 


2145


Prominin


Prominin


1.9e-139


448.0 


3 


287-510 


2146 


fibrinogen_C


Fibrinogen beta and gamma chains, C-ter


6.5e-54


184.0 


1 


13-231 


2147 


fibrinogen C 


Fibrinogen beta and gamma chains, C-ter


6.5e-54 


184.0 


1 


13-231 


2148 


fibrinogen_C 


Fibrinogen beta and gamma chains, C-ter


6.5e-54 


184.0 


1 


13-231 


2150 


DUF381 


Domain of unknown function (DUF381)


0.48 


4.4 


1 


29-35 


2151 


aa_permeases 


Amino acid permease 


7e-24 


89.2 


1 


6-294 


2151 


Pox 15 


Poxvirus protein 15 


0.24 


6.0 


1 


85-102 


2151 


serine carbpept 


Serine carboxypeptidase 


0.41 


2.3 


1 


301-321 


2153 


spectrin 


Spectrin repeat 


0.4 


5.5 


1 


410-463 


2154 


spectrin 


Spectrin repeat 


0.4 


5.5 


1 


410-463 


2155 


Peptidase M20 


Peptidase family M20/M25/M40 


0.00038 


14.5 


1 


39-120 


2156 


sugar tr


Sugar (and other) transporter 


0.11 


5.5 


1 


47-103 


2156 


Octopine_DH


NAD/NADP octopine/nopaline dehydrogenas


0.26 


4.6 




153-169 


2156 


sugar tr 


Sugar (and other) transporter 


5e-08 


28.1 




201-336 


2159 


bromodomain 


Bromodomain 


9.5e-45 


158.8 


1 


74-163 


2159 


bromodomain 


Bromodomain 


3e-40 


143.5 




367-456 


2159 


Alpha adaptin C 


Alpha adaptin AP2, C-terminal domain 


0.48 


2.6 


1 


406-418 


2159 


Phage X 


Phage X family 


0.97 


3.7 




449-480 


2159 


eIF3c N 


Eukaryotic translation initiation fac 


0.51


1.2 




484-570 


2159 


Vitellogenin N 


Lipoprotein amino terminal region 


0.61 


1.5 


1 


495-550 


2159 


Herpes U44 


Herpes virus U44 protein 


0.47 


3.1 


1 


526-540 


2161 


ig 


Immunoglobulin domain 


6.4e-06 


25.0 


1 


58-118 


2164 


pkinase 


Protein kinase domain 


2.6e-38 


136.6 


1 


7-108 


2164 


TMP 


TMP repeat 


0.37 


8.0 




74-84 


2165 


pkinase 


Protein kinase domain 


2.6e-38 


136.6 




7-108 


2165 


TMP 


TMP repeat 


0.37 


8.0 




74-84 


2166 


glutaredoxin 


Glutaredoxin 


0.00075 


15.0 




12-65 


2166 


GST N 


Glutathione S-transferase, N-terminal


0.019 


11.1 


1 


13-63 


2166 


GST C 


Glutathione S-transferase, C-terminal 


0.00013 


17.6 




189-281 


2166 


UL21 


Herpesvirus UL21 


0.98 


0.3 


1 


212-240 


2167 


PadR 


Transcriptional regulator PadR-like f 


0.22 


6.1 




18-31 


2167 


Collagen 


Collagen triple helix repeat (20 copi 


2.4e-05 


22.3 


1 


43-76 


2167 


Collagen 


Collagen triple helix repeat (20 copi 


1.5e-07 


30.6 




77-122 


2167 


C1q


C1q domain


2.9e-72 








2167 


TOBE 


TOBE domain 


0.5 


6.3 




223-242 


2169 


Sec6 


Exocyst complex component Sec6 


0.71 


2.3 




166-194 


2169 


BRCT 


BRCA1 C Terminus (BRCT) domain


0.0053 


11.4 




278-315 


2169 


Chitin bind 3 


Chitin binding domain 


0.95 


2.1 




308-321 


2169 


BRCT 


BRCA1 C Terminus (BRCT) domain


0.00072 


14.3 


2 


329-369 


2169 


BRCT 


BRCA1 C Terminus (BRCT) domain


5.7e-19 


65.1 


3 


378-451 


2169 


BRCT 


BRCA1 C Terminus (BRCT) domain


4e-19


65.6 


4 


536-622 


2169 


RinB 


Transcriptional activator RinB 


0.33 


5.4 


1 


595-646 
2169


BRCT 


BRCA1 C Terminus (BRCT) domain


0.028 


9.0 


5 


645-680 


2170


Sec6 


Exocyst complex component Sec6 


0.71 


2.3 


1 


166-194 


2170


BRCT 


BRCA1 C Terminus (BRCT) domain 


0.0053 


11.4 


1 


278-315 


2170 


Chitin bind 3 


Chitin binding domain 


0.95 


2.1 


1 


308-321 


2170 


BRCT


BRCA1 C Terminus (BRCT) domain


0.00072 


14.3 


2 


329-369 


2170


BRCT


BRCA1 C Terminus (BRCT) domain 


5.7e-19 


65.1 


3 


378-451 


2170 


BRCT 


BRCA1 C Terminus (BRCT) domain


4e-19 


65.6 


4 


536-622 


2170


RinB


Transcriptional activator RinB


0.33 


5.4 


1 


595-646 


2170


BRCT


BRCA1 C Terminus (BRCT) domain


0.028 


9.0 


5 


645-680 


9 1 99 

/.ill 




Leucine rich repeat C-terminal domain


8.5e-09 


28.1 




45-91 


9 1 TO 

21 /z 


UPF0118


Domain of unknown function DUF20


1 


2.9 


1


219-242 


9 1 11 

21 15 




Immunoglobulin domain


7.9e-06 


24.7 


1


39-93


2173


Na_Ca_Ex


Sodium/calcium exchanger protein


0.86 


4.3 


1 


133-148 


2173


COX17


Cytochrome C oxidase copper chaperone


0.68 


3.6 


1


196-209 


2174


TB2_DP1_HVA22


TB2/DP1 HVA22 family


3.8e-34 


123.6 


1 


18-111 


2174


ELM2


ELM2 domain 


0.53 


5.2 


1 


114-139 


2175


An_peroxidase


Animal haem peroxidase 


1.3e-91 


311.6 


1 


2-232 


2175


7tm 1 


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 


1 


24-32 


2175 


Peptidase_C1


Papain family cysteine protease 


0.76 


2.1 


1 


117-134 


2176 


An_peroxidase 


Animal haem peroxidase 


1.3e-91 


311.6 


1


2-232 


2176


7tm 1


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 




24-32 


2176


Peptidase_C1


Papain family cysteine protease 


0.76 


2.1 


1 


117-134 


2177


An_peroxidase


Animal haem peroxidase 


1.3e-91 


311.6 


1 


2-232 


2177


7tm 1 


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 


1 


24-32 


2177 


Peptidase_C1


Papain family cysteine protease 


0.76 


2.1 


1


117-134 


2178


DUF846


Eukaryotic protein of unknown functio 


0.0084 


8.0 




55-84 


91 70 

Zl 




Uncharacterised protein family (UPF01 


0.04 


7.4 


1


341-366 


2179 


PS_pyruv trans 


Polysaccharide pyruvyl transferase 


0.55 


3.3 


1


355-411 


2180


COX17


Cytochrome C oxidase copper chaperone


0.51 


4.0 





39-60 


2180


RIIa


Regulatory subunit of type II PKA R-s


1e-14


54.8 




67-104 


2180


SURF6


Surfeit locus protein 6


0.027 


7.2 




84-155 


2180 


cNMP binding 


Cyclic nucleotide-binding domain 


7.2e-31 


112.5 





194-282 


2180


RNA_pol_Rpb2_4


RNA polymerase Rpb2, domain 4


0.28 


6.2 




226-233 


2180


cNMP binding


Cyclic nucleotide-binding domain 


9.4e-32 


115.7 




312-406 


2180


Methyltransf_1


6-O-methylguanine DNA methyltransfera


0.64 


4.3 


1


367-379 


2181


PDZ


PDZ domain (Also known as DHR or GLGF)


6.7e-12 


43.7 


1 


5-86 


2182


PLAT


PLAT/LH2 domain 


1.7e-31 


108.4 


1


2-111 


2182


lipoxygenase


Lipoxygenase


3.9e-194


655.1 


1


113-624 


2182 


DUF181 


Uncharacterized ACR, COG1944 


0.81 


2.4 




221-232 


2182 


PG binding 1 


Putative peptidoglycan binding domain 


0.5 


5.6 




395-411 


2183 


PLAT 


PLAT/LH2 domain 


1.7e-31 


108.4 




2-111 


2183 


lipoxygenase 


Lipoxygenase 


3.9e-194


655.1 




113-624 


2183 


DUF181 


Uncharacterized ACR, COG1944 


0.81 


2.4 




221-232 


2183 


PG_binding_1


Putative peptidoglycan binding domain 


0.5 


5.6 




395-411 


2184 


PLAT 


PLAT/LH2 domain 


1.7e-31 


108.4 




2-111 


2184 


lipoxygenase 


Lipoxygenase 


3.9e-194


655.1




113-624








2184 


DUF181


Uncharacterized ACR, COG1944


0.81 


2.4 


1 


221-232 


2184 


PG_binding_1


Putative peptidoglycan binding domain


0.5 


5.6 


1 


395-411 


2186 


TFIIS


Transcription factor S-II (TFIIS) 


I 

0.19 


4.6 

7.9 II 


1 
1 


11-21 
220-257 


2186 
2186 


DUF536 
FCH 


Protein of unknown function, DUF536 
Fes/CIP4 homology domain 


0.5 


5.6 


1


265-284 


2192 


Aa_trans


Transmembrane amino acid transporter prote




33 4 ! 


1 


4-56 


2193


EGF 


EGF-like domain 


0.024 


11.2 


1 


42-57 


2193


EGF 


EGF-like domain 


1.3e-06 


26.6 


2


60-88 


2193


EGF


EGF-like domain 


1.2e-09 


37.5 


3 


95-128 


2193


Cripto


Cripto growth factor 


0.86 


3.4 


i 


101-132 


2193


laminin EGF


1/3 32 60.. 2 43 


0.025 


9.9 


2


106-130 


2193 


EGF 


EGF-like domain 


5.5e-07 


27.9


4 


135-171 


2194 


M


M protein repeat 


0.8 
0.78 


7.1 
2.2 


1 

1


64-84 
303-319 


2194
2194


PP1_inhibitor
bZIP


PKC-activated protein phosphatase-1 i
1/2 65 82.. 48 65 


0.32 


6.2


2 


397-415 


2194


TSC22


TSC-22/dip/bun family 


0.045 


7.2 


1


398-415 


2195 


ank


Ankyrin repeat 


0.0017 


15.6 


2 


206-231 


2195 


G-patch


G-patch domain 


2e-16 


58.7 


1


319-363 


2195 


Anti-silence


Anti-silencing protein, ASF1-like


0.18 


5.1 


1


365-378 


2196 


endotoxin 


delta endotoxin 


0.85 


2.3 


1


134-151 


2197 


Peptidase M24 


metallopeptidase family M24 


5.5e-69 
0.089 


239.3 
7.1 


1 
1 


103-342 
184-195 


2197
2199 


DUF120 
PAAD DAPIN 


Domain of unknown function DUF120 
PAAD/DAPIN/Pyrin domain 


1.3e-11


41.6 


1 


18-103 


2199
2199


DHHA1
UPF0160


DHHA1 domain

Uncharacterised protein family (UPF01 


0.61 
1 


5.4 
2.3 


1
1 


67-87 
75-86 


2199
2199


RNA helicase 
NACHT 


RNA helicase 

NACHT domain 


0.03 

3.8e-74 

0.15 


7.9 

252.4 

5.2 


1 

1 
1 


195-215

196-365

197-215


2199
2199


AAA 

Peptidase_S15


ATPase family associated with various 
X-Pro dipeptidyl-peptidase (S15 famil


0.64 


2.1 


1 


929-984 


2200 


PDZ 


PDZ domain (Also known as DHR or 
GLGF 


8.1e-22


78.5 


1 


35-114 


2200 
2200 


CDC50 
DUF100 


LEM3 (ligand-effect modulator 3) fami
Protein of unknown function DUF100


1 

0.2 


2.1 
4.1 


1 
1 


1 ai 1 1 a 
101-1 lo 

117-130 

1 ^0 1 A*y 


2201 
2201 


DIE2 ALG10 
DUF718 


DIE2/ALG10 family 

Protein of unknown function (DUF718) 


1.5e-54 
0.64 


191.4 
4.4 


1 
1 


62-142 
70-77 


2202


rrm 


RNA recognition motif. (a.k.a. RRM, R 


1.3e-09 


36.2 


1 


76-143 


2202


RbsD FucU 


RbsD / FucU transport protein family 


0.53 


3.4 


1 


138-162 


2202


HemX 


HemX 


0.37 


3.5 


1 


157-188 


2202


rrm 


RNA recognition motif. (a.k.a. RRM, R 


4.6e-13 


48.6 


2 


201-268 


2202


rrm


RNA recognition motif. (a.k.a. RRM, R


4.3e-13 


48.7 


3 


354-421 


2202


rrm


RNA recognition motif. (a.k.a. RRM, R


1.4e-06 


25.5 


4 


471-539 


2203


C tripleX


Cysteine rich repeat


2e-05 


17.8 


1 


76-93 


2203 


Bowman-Birk_leg


Bowman-Birk serine protease inhibitor 


1 


4.0 


1 


85-100 


2203 


laminin EGF 


Laminin EGF-like (Domains III and V) 


0.32 


6.1 


1 


97-110 


2203 


EGF 


EGF-like domain 


8.7e-06 


23.6 


2 


115-143 


2203 


TIL 


Trypsin Inhibitor like cysteine rich 


0.0035 


11.0 


1 


134-155 


2203


EGF 


EGF-like domain 


7.5e-05 


20.2 


3 


155-189 


2203 


TIL 


Trypsin Inhibitor like cysteine rich 


0.26 


5.1 


2 


168-195 


2203 


toxin 5 


Scorpion short toxin 


0.34 


4.4 


1 


170-175 


2203 


EGF 


EGF-like domain 


4.4e-05 


21.1 


4 


195-228 


2203 


EGF 


EGF-like domain


9.7e-09 


34.3 


5


240-275 
2203


MAM 


MAM domain 


9.2e-38 


135.6 


1 


421-566 


2204 


C tripleX 


Cysteine rich repeat 


2e-05 


17.8 


1 


76-93 


2204 


Bowman-Birk_leg


Bowman-Birk serine protease inhibitor 


1 


4.0 


1


85-100 


2204 


laminin EGF 


Laminin EGF-like (Domains III and V)


0.32 


6.1 


1 


97-110 


2204 


EGF


EGF-like domain 


8.7e-06 


23.6 


2 


115-143 


2204 


TIL 


Trypsin Inhibitor like cysteine rich 


0.0035 


11.0 


1 


134-155 


2204 


EGF 


EGF-like domain 


7.5e-05 


20.2 


3 


155-189 


2204 


TIL 


Trypsin Inhibitor like cysteine rich 


0.26 


5.1 


2 


168-195 


2204 


toxin 5 


Scorpion short toxin 


0.34 


4.4 


1 


170-175 


2204 


EGF


EGF-like domain 


4.4e-05 


21.1 


4 


195-228 


2204 


EGF 


EGF-like domain 


9.7e-09 


34.3


5 


240-275 


2204 


MAM 


MAM domain 


9.2e-38 


135.6 


1 


421-566 


2205 


TH1 


TH1 protein


0.91 


0.2 


1


315-328 


2205 


Neg reg 


Negative transcriptional regulator 


1 


2.3 


1 


587-596 


2205 


zf-MYND 


MYND finger


2e-08 


27.7 


1 


654-688 


2206 


TH1 


TH1 protein


0.91 


0.2 


1 


315-328 


2206 


Neg reg 


Negative transcriptional regulator 


1 


2.3 


1 


587-596 


2206 


zf-MYND 


MYND finger 


2e-08 


27.7 


1 


654-688 


2207 


TH1


TH1 protein


0.91 


0.2 


1 


315-328 


2207 


Neg reg 


Negative transcriptional regulator 


1 


2.3 


1 


587-596 


2207 


zf-MYND 


MYND finger 


2e-08 


27.7 


1 


654-688 


2208 


TH1


TH1 protein


0.91 


0.2 


1 


315-328 


2208 


Neg reg 


Negative transcriptional regulator 


1 


2.3 


1 


587-596 


2208 


zf-MYND 


MYND finger 


2e-08 


27.7 


1 


654-688 


2209 


Urotensin II 


Urotensin II 


0.36 


5.4 


1 


82-92 


2209 


fn2


Fibronectin type II domain


0.55 


3.5 


1 


83-91 


2210 


RNA helicase 


RNA helicase 


0.03 


7.9 


1 


91-111 


2210 


NACHT 


NACHT domain 


3.8e-74 


252.4 


1 


92-261 


2210 


AAA 


ATPase family associated with various 


0.15 


5.2 


1 


93-111 


2211 


disintegrin 


Disintegrin 


3.3e-36 


123.1 


1 


4-79 



2211 


EGF 


EGF-like domain 


0.0023 


14.8 


1 


231-258 


2213 


zf-MYND 


MYND finger 


1 


3.8 


1 


38-45 


2213 


ank 


Ankyrin repeat 


4.4e-06 


24.9 


1 


159-187 


2213 


ank 


Ankyrin repeat 


6.9e-09 


35.0 


2 


191-223 


2213 


ank 


Ankyrin repeat 


0.15 


8.6 


3 


224-256 


2213 


ank 


Ankyrin repeat 


9.7e-10 


38.0 


4 


258-290 


2213 


ank 


Ankyrin repeat 


0.00014 


19.5 


5 


291-336 


2213 


LolA 


Outer membrane lipoprotein carrier pr 


1 


3.0 


1 


317-340 


2213 


ank 


Ankyrin repeat 


3.8e-08 


32.3 


6 


337-369 


2213 


ank 


Ankyrin repeat 


0.49 


6.8 


7 


370-402 


2214 


interferon 


Interferon alpha/beta domain 


3.7e-42 


145.6 


1 


27-116 


2215 


DUF602 


Protein of unknown function, DUF602 


1.3e-202


683.2 


1 


15-303 


2215 


Bromo MP 


Bromovirus movement protein 


0.062 


6.4 


1


21-47 


2216 


DUF846 


Eukaryotic protein of unknown function


0.012 


/.j 




120-150 


2218 


acid phosphat 


Histidine acid phosphatase 


5.5e-13 


45.0 




137-232 


2219 


PH 


PH domain 


1.9e-20 


68.8 




78-238 


2219 


ArfGap 


Putative GTPase activating protein fo 


5.4e-50 


174.5 




259-379 


2219 


ank 


Ankyrin repeat 


1.9e-09 


36.9 




418-450 


2219 


ank 


Ankyrin repeat 


0.022 


11.6 




451-475 


2219 


SapB 2 


Saposin-like type B, region 2 


0.33 


6.5 




464-475 


2220


PH 


PH domain 


1.9e-20 


68.8 




78-238 
2220


ArfGap 


Putative GTPase activating protein fo 


5.4e-50 


174.5 




259-379 


2220 


ank 


Ankyrin repeat 


1.9e-09 


36.9 




418-450 


2220 


ank 


Ankyrin repeat 


0.022 


11.6 




451-475 


2220 


SapB 2 


Saposin-like type B, region 2 


0.33 


6.5 




464-475 


2221 



PH 


PH domain 


1.9e-20 


68.8 




78-238 


2221 


ArfGap 


Putative GTPase activating protein fo 


5.4e-50 


174.5 




259-379 


2221 


ank 


Ankyrin repeat 


1.9e-09 


36.9 




418-450 


2221 


ank 


Ankyrin repeat 


0.022 


11.6 




451-475 


2221 


SapB 2 


Saposin-like type B, region 2 


0.33 


6.5 




464-475 


2222 



PH 


PH domain 


1.9e-20 


68.8 




78-238 


2222 


ArfGap 


Putative GTPase activating protein fo 


5.4e-50 


174.5 




259-379 


2222 


ank 


Ankyrin repeat 


1.9e-09 


36.9 




418-450 


2222 


ank 


Ankyrin repeat 


0.022 


11.6 




451-475 


2222 


SapB 2 


Saposin-like type B, region 2 


0.33 


6.5 




464-475 


2223 


Reprolysin 


Reprolysin (M12B) family zinc metallo 


2.4e-35 


127.6 




3-83 


2223 


Astacin 


Astacin (Peptidase family M12A) 


0.21 


5.0 




23-37 


2223 


Phi 1 


Phosphate-induced protein 1 conserved 


0.51 


3.3 




71-83 


2223 


disintegrin 


Disintegrin 


0.0019 


12.9 




101-136 


2224 


Uteroglobin 


Uteroglobin family 


6.6e-09 


29.8 




1-88 


2225 


C1q


C1q domain


6.3e-06 


23.8 




98-138 


2226 


Ornatin 


Ornatin 


0.55 


4.8 




99-106 


2227 


Ornatin 


Ornatin 


0.55 


4.8 




99-106 


2228 


Gag MA 


Matrix protein (MA), p15


0.11 


6.5 




96-152 


2229 


Seryl_tRNA_N


Seryl-tRNA synthetase N-terminal doma


0.92 


5.7 




241-258 


2229 


pentaxin 


Pentaxin family 


4.3e-24 


83.3 




363-526 


2229 


Avirulence 


Xanthomonas avirulence protein, Avr/P 


0.07 


3.6 




501-515 


2233 


ion trans 


Ion transport protein 


0.001 


14.0 




22-141 


2233 


Sarcolipin 


Sarcolipin 


0.56 


5.3 




95-123 


2234 


ion trans 


Ion transport protein 


0.001 


14.0 




22-141 


2234 


Sarcolipin 


Sarcolipin 


0.56 


5.3 




95-123 


2235 


zf-C2H2 


Zinc finger, C2H2 type 


0.033 


13.4 




100-123 



2235 


TFIID-31 


Transcription initiation factor IID, 3 


0.28 


5.7 




120-135 


2235 


zf-C2H2 


Zinc finger, C2H2 type 


0.14 


10.9 




134-156 


2238 


asp 


Eukaryotic aspartyl protease 


1.1e-24


87.5 




1-67 


2239 


Sulfatase 


Sulfatase 


4.5e-05 


18.1 




57-122 


2240 


Zn carbOpept 


Zinc carboxypeptidase 


3.7e-57 


193.6 




13-156 


2241 


Zn carbOpept 


Zinc carboxypeptidase 


3.7e-57 


193.6 




13-156 


2242 


NifU N 


NifU-like N terminal domain 


1.7e-80


277.6




34-160 


2244 


zf-C2H2 


Zinc finger, C2H2 type


0.00035 


21.4 




56-81 


2244 


zf-C2H2 


Zinc finger, C2H2 type


0.012 


15.2 




90-117 


2244 


zf-C2H2 


Zinc finger, C2H2 type 


0.0039 


17.1 




123-147 


2245 


pkinase 


Protein kinase domain 


3.2e-90 


309.9 




49-341 


2245 


Glyco hydro_15 


Glycosyl hydrolases family 15 


0.18 


4.4 




501-551 


2248 


C1q


C1q domain


5.1e-23 


86.7 




27-135 


2249 


Allantoicase 


Allantoicase repeat 


A A1 A 

0.014 








2249 


Allantoicase 


Allantoicase repeat 


1.3e-57 


196.4 




46-206 


2250 


DNA ligase A_ 
C 


ATP dependent DNA ligase C terminal 
r 


0.67 


5.4




25-48


2250


ig 


Immunoglobulin domain 


0.00019 


19.5 




51-165 


2250 


ig


Immunoglobulin domain 


0.15 


8.7 




196-257 


2250 


ig 


Immunoglobulin domain 


0.0031 


15.0 




289-349 


2250 


SK_channel 


Calcium-activated SK potassium 
channe 


0.035 


7.1 




377-397


2251 


FH 


r ri aomain 


2.4e-24 


81.6 




43-153 


2251 


UC1CT 


Upnoron cui -fciti* 9-O-sulfotransferase 
ncpdrdn suii**'-^' ^ auuuuotwivnww 


0.27 


4.4 




160-182 


2251 


LMP 


LMP repeated region 


0.0012 


14.2 




180-201 


2251 


DUF603 


Protein of unknown function, DUF603


0.04 


6.4 




193-207 


2251 


Pox A type_inc 


Viral A-type inclusion protein repeat 


0.32 


7.2 




193-207 


2251 


IQ


IQ calmodulin-binding motif


5e-05 


20.1 




226-246 


2251 


RhoGEF 


RhoGEF domain 


1.2e-69 


236.9 




267-448 


2251 


DUF674 


Protein of unknown function (DUF674)


0.82 


1.4 




295-305 


2251 


Stig1


Stigma-specific protein, Stig1


0.6 


2.3 




396-441 


2251 


PH 


PH domain 


2.3e-13 


45.3 




480-608 


2251 


RasGEFN 


Guanine nucleotide exchange factor fo


1.1e-19


71.3 




653-708 


2251 


RasGEF 


RasGEF domain 


7.2e-89 


305.4 




1019-1204


2251 


Adeno_terminal


Adenoviral DNA terminal protein 


1 


1.7 




1195-1227


2252 


DUF630 


Protein of unknown function (DUF630)


0.7 


4.3 




584-597 


2252 


FGF 


Fibroblast growth factor


0.37 


4.4 




620-635 


2252 


tRNA-synt 2 


tRNA synthetases class II (D, K and N


0.74 


3.5 




646-658 


2252 


Omega-atracotox 


Omega-atracotoxin


0.15 


5.1 




751-758 


2253 


K tetra 


K+ channel tetramerisation domain


2e-34 


121.3 




26-114 


2253 


BTB 


BTB/POZ domain


0.0015 


14.2 




74-125 


2254 


PXA 


PXA domain


0.01 


10.2 




90-110 


2254 


Vps52 


Vps52 / Sac2 family


0 


1089.2




100-609 


2254 




trp syntA 


Tryptophan synthase alpha chain


0.78 


3.1 




179-216 


2254 


DUF965


Bacterial protein of unknown function


0.33 


4.5 




291-304 


2255 


NosL 


NosL


0.29 


4.9 




104-128 


2255 


NAC 


NAC domain 


0.76 


5.5 




150-172 


2255 


DUF240


MG032/MG096/MG288 family 2


0.17 


6.7 




176-191 


2256 


NosL 


NosL


0.29 


4.9 




104-128 


2256 


NAC 


NAC domain


0.76 


5.5 




150-172 


2256 


DUF240 


MG032/MG096/MG288 family 2


0.17 


6.7 




176-191 


2258 


ZZ


Zinc finger, ZZ type


1e-12


48.2 




5-50 


2258 


SoxD 


Sarcosine oxidase, delta subunit fami


0.97 


4.2 




79-86 


2258 


zf-C2H2


Zinc finger, C2H2 type


0.00067 


20.3 




80-103 


2258 


zf-C3HC4


Zinc finger, C3HC4 type (RING finger)


0.3 


3.6 




95-115 


2258 


SPDY 


Domain of unknown function (DUF317)


0.6 


4.4 




119-133 


2258 


Di19


Drought induced 19 protein (Di19)


0.00056 


13.0 




314-330 


2261 


RmuC 


RmuC family


0.79 


3.1 




16-46 


2261 


IBN NT 


Importin-beta N-terminal domain


2.1e-27 


99.5 




34-113 


2261 


Peripla_BP_like


Periplasmic binding proteins and suga


0.21 


4.7 




142-173 


2262 


Las1


Las1-like


1.6e-94 


320.7 




55-203 


2262 


MuDR 


MuDR family transposase


0.17 


5.5 




231-263 


2262 


BAR 


BAR domain 


0.21 


5.2 




347-363 


2262 


Adeno_E1B_19K


Adenovirus E1B 19K protein / small t- 


0.43 


4.6 




534-558 


2262 


META 


Domain of unknown function (306) 


0.91 


5.3 




632-663 


2263 


ank 


Ankyrin repeat 


0.00019 


19.0 




1-33 


2263 


DMRL synthase 


6,7-dimethyl-8-ribityllumazine synthas 


0.35 


5.0 




33-48 


2263 


hormone 


Somatotropin hormone family 


0.23 


2.6 




85-115 


2265 


ig


Immunoglobulin domain 


4.1e-05 


22.0 




10-78 


2265 


ig 


Immunoglobulin domain 


3.7e-10 


40.9 


2 


113-172 


2265 


ig 


Immunoglobulin domain 


0.0018 


15.9 


3 


211-272 


2265


ig 


Immunoglobulin domain 


3.7e-08 


33.4 


4 


309-370


2265 


DNA_pol_B_2


DNA polymerase type B, organellar
and


0.018


7.9


1 


326-382 


2265 


OapA 




0.44 


2.4 


1 


335-357 


2265 


ig 


Immunoglobulin domain


0.0012


16.6


5 


404-465 


2265 


ig


Immunoglobulin domain 


7.7e-07 


28.5 


6 


500-564 


2269 


efhand 


EF hand


2.8e-08


33.9 


1 


60-88 


2269 


COX17 


Cytochrome C oxidase copper 
chaperone 


0.42 


4.2 


1 


85-92 


2269 


efhand 


EF hand


0.0033


15.3 


2 


96-124 


2269 


efhand 


EF hand 


8.5e-05 


21.1 


3 


133-161 


2269 


PCRF 


PCRF domain


U.HJ ! 


6.1 




160-176 


2269 


DUF21 


Domain of unknown function DUF21 


0.18 


6.4 


1 


165-189 


2269 


efhand 


EF hand 




jO. I 


4 

*T 


169-197 


2270 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061
fam 


1.2e-14


51.3


1 
1 


15-61 


2270 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061
fam 


£ Rp <\1 
O.og-jZ 


1 87 

10/..U 


2 


95-275 


2270 


Flavodoxin 2 


Flavodoxin-like fold


0.66


3.3


i 
i 


369-384 


2270 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061 
fam 


1.2e-05 


19.1 


3 


399-440 


2270 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061
fam


1.9e-49


174.5


4 


501-654 


2270 


Flavodoxin 2 


Flavodoxin-like fold


0.66




2 


748-763 


2270 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061
fam 


1.2e-05 


19.1 


5 


778-819 


2271 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061
fam 


1.2e-14


51.3




15-61 


2271 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061
fam 




187 fi 


2 


95-275 


2271 


Flavodoxin 2 


Flavodoxin-like fold 




3.3


1 


369-384 


2271 


UPF0061 


Uncharacterized ACR, YdiU/UPF0061
fam




19.1


3 


399-440 


2271 


UPF0061


Uncharacterized ACR, YdiU/UPF0061
fam 


1.9e-49 


174.5 


4 


501-654 




2271


Flavodoxin 2


Flavodoxin-like fold


0.66 


3.3 


2 


748-763 


2271


UPF0061


Uncharacterized ACR, YdiU/UPF0061
fam


1.2e-05 


19.1 


5 


778-819 


III I 


7tm_1


7 transmembrane receptor (rhodopsin
fam


9.7e-25 


72.7 


1 


1-107 


ZZ/J 




Leucine Rich Repeat


0.00057 


16.1 


1 


40-63 


ZZ/j 


LRR


Leucine Rich Repeat


0.004 


13.3 


3 


88-113 


2273






0.84 


5.4 


4 


114-131 


2274


AMP-binding


AMP-binding enzyme


6.9e-18 


64.1 


1 


20-135 


2275


cytochrome c


Cytochrome c


0.92 


3.7 


1 


94-110 


2275 


cNMP binding 


Cyclic nucleotide-binding domain 


1.5e-15 


57.4 


1


126-216 


2275 


RasGEFN 


Guanine nucleotide exchange factor fo 


0.00023 


17.5 


1 


241-285 


2275 


Pseu avirulence 


Avirulence protein 


0.91 


1.9 


1 


272-285 


2275 


PDZ 


PDZ domain (Also known as DHR or 
GLGF 


2e-09 


35.0 


1 


361-412 


2276


cytochrome_c


Cytochrome c 


0.92 


3.7 


1 


94-110 


2276 


cNMP binding 


Cyclic nucleotide-binding domain 


1.5e-15 


57.4 


1 


126-216 


2276 


RasGEFN 


Guanine nucleotide exchange factor fo 


0.00023 


17.5 


1 


241-285 


2276 


Pseu avirulence 


Avirulence protein 


0.91 


1.9 


1


272-285 


2276 


PDZ 


PDZ domain (Also known as DHR or 


2e-09 


35.0 


1 


361-412


Model


Description 


E_value 


Score 


Repeats 


Position






GLGF 










2280


Ricin B lectin 


QXW lectin repeat 


0.14 


8.2 




50-77


2280


MCR beta N


Methyl-coenzyme M reductase beta 
subun 


0.98 


2.1 




68-76


2281


ArsA ATPase


Anion-transporting ATPase


0.54


3.1




26-52


2281


Par A 


ParA family ATPase 


1.4e-25 


89.8 




111-202


2281


SCF 


Stem cell factor 


1.1e-27


90.4 




206-259


2281


FH2 


Formin Homology 2 Domain 


0.027 


8.8 




221-238 


2282




Cadherin domain 


6e-20 


71.3 




3-81 


2282




Cadherin domain 


5.9e-21 


74.8 




118-191 


2283


cadherin


Cadherin domain 


6e-20 


71.3 




3-81 


2283 


cadherin 


Cadherin domain 


5.9e-21 


74.8 




118-191 


2284


PH 

in 


PH domain 


7.9e-10 


33.6 




4-92


2284 


DUF1041 


Domain of Unknown Function 
(DUF1041) 


1.6e-07 


28.1 




206-237 


2285


Renal dipeptase


Renal dipeptidase 


9.3e-05 


15.8 




74-102 


2286




Amino acid permease 


7e-24 


89.2 




6-294 


2286


Pnv 


Poxvirus protein 15 


0.24 


6.0 




85-102 


2286


serine carbpept


Serine carboxypeptidase 


0.41 


2.3 




301-321 


2287 


THF DHG CYH 


Tetrahydrofolate dehydrogenase/cycloh 


2.3e-11


32.7 




62-123 


2287


THF DHG CYH C


Tetrahydrofolate dehydrogenase/cycloh 


6.1e-10 


36.6 




125-171 


2287


FTHFS 


Formate-tetrahydrofolate ligase 


0 


1365.1




302-92T] 


2288 


acid phosphat 


Histidine acid phosphatase 


0.038 


6.9 




391-407 


2288


FMN red 


NADPH-dependent FMN reductase 


0.94 


3.3 




438-459 


2288 


acid phosphat 


Histidine acid phosphatase 


0.02 


7.9 




525-594 


2288 


Ribosomal L6 


Ribosomal protein L6 


0.21 


7.2 




774-814 


2290 


PI-PLC-X 


Phosphatidylinositol-specific 
phospholipase 


3.8e-14 


50.6 




1-33


2292 


ABG transport 


AbgT putative transporter family 


0.81 


1.2 




21-34 


2292 


7tm 1 


7 transmembrane receptor (rhodopsin f 


1.6e-30 


90.1 




48-297 


2292 


HECT 


HECT-domain (ubiquitin-transferase) 


0.15 


5.5 




281-298 


2293 


tsp_3 


Thrombospondin type 3 repeat 


0.00058 


15.9 




13-25


2293 


tsp 3 


Thrombospondin type 3 repeat 


0.0033 


13.4 


2 


36-48 


2293


tsp_3 


Thrombospondin type 3 repeat 


0.0011 


15.0 


3 


51-66 


2293 


tsp 3 


Thrombospondin type 3 repeat 


0.00057 


15.9 


4 


74-86


2293 


tsp 3 


Thrombospondin type 3 repeat 


0.0015 


14.6 


6 


114-126 


2293


tsp_3


Thrombospondin type 3 repeat 


0.03 


10.3 


7 


127-142 


2293 


TSPC 


Thrombospondin C-terminal region 


7.1e-176


594.4 


1 


167-367 




Mnd1


Mnd1 family


0.68 


3.4 


1 


366-374 


2294 


Vps52 


Vps52/Sac2 family 


0.087 


3.9 


4 


154-183 




L/OmpicXl l 

D 


NADH:ubiquinone oxidoreductase 17.2 
k 


0.25 


6.1 




562-587 


2294 


mRNA triPase 


mRNA capping enzyme, beta chain 


0.33 


4.0 






2294 


DUF424 


Protein of unknown function (DUF424) 


0.79 


4.6 




1002-1017


2295 


sodcu 


Copper/zinc superoxide dismutase 
(SOD 


1 


2.0 




10-23 


2295 


DapB C 


Dihydrodipicolinate reductase, C-term 


0.84 


4.5 




17-31 


2295 


PTR2 


POT family 


1.9e-104


357.1 




82-475 


2296


FH2 


Formin Homology 2 Domain 


0.0052 


11.5 


1 


98-144


2296


NTP transf 2 


Nucleotidyltransferase domain 


0.049 


7.4 


1 


104-170 


2296 


FH2 


Formin Homology 2 Domain 


1.6e-05 


20.9 


2 


158-186 


2297 


zf-C2H2 


Zinc finger, C2H2 type 


0.01 


15.5 


1 


1-21 


2297 


XPA N 


XPA protein N-terminal 


0.51 


5.7 


1 


24-36 


2297 


TFIIS 


1/6 1 9T 31 39 


0.16 


7.4 


2 


27-37 


2297 


zf-C2H2 


Zinc finger, C2H2 type 


6.7e-06 


28.3 


2 


27-49 


2297 


XPA N 


XPA protein N-terminal 


0.49 


5.8 


2


52-64 


2297 


TFIIS 


1/6 1 9 r. 31 39 


0.18 


7.2 


3 


55-65 


2297 


zf-C2H2 


Zinc finger, C2H2 type 


4e-06 


29.2 


3 


55-77 


2297 


TFIIS


1/6 1 9 f. 31 39 


0.51


5.7 


4 


83-93 


2297 


zf-C2H2 


Zinc finger, C2H2 type 


2.7e-05 


25.9 


4 


83-105


2297 


zf-C2H2 


Zinc finger, C2H2 type 


7.8e-07 


32.1 


5 


111-133 


2297 


XPA N 


4/5 108 120.. 1 13 


0.45 


5.9 


5 


136-148 


2297 


eIF5 eIF2B 


Domain found in IF2B/IF5 


0.95 


3.5 


1 


139-149 


2297 


TFIIS 


1/6 1 9 r. 31 39 


0.069 


8.7 


6 


139-149 


2297 


Transposase 12


Transposase 


0.48 


3.6 


1 


139-165 


2297 


zf-C2H2


Zinc finger, C2H2 type 


6.6e-07 


32.4 


6 


139-161 


2298 


Sprouty 


Sprouty protein (Spry) 


1.2e-17 


55.0 


1 


70-107 


2299 


HAMP 


HAMP domain 


0.21 


7.3 


1 


9-42 


2299 


PA 


PA domain 


3.6e-19 


65.4 


1 


155-255 


2299 


Peptidase M28 


Peptidase family M28 


2e-118 


403.6 


1 


332-585 


2299 


Borrelia lipo


Borrelia burgdorferi virulent strain 


0.98 


2.5 


1 


591-604 


2299 


TFR dimer 


Transferrin receptor-like dimerisatio 


1e-65


228.5 


1 


597-739 


2300 


GvpG 


Gas vesicle protein G 


0.088 


6.7 


1 


43-75 


2301 


Sema 


Sema domain 


5.5e-05 


17.6 


1 


34-113 


2303 


ZZ 


Zinc finger, ZZ type


1e-12


48.2 


1 


5-50 




SoxD 


Sarcosine oxidase, delta subunit fami 


0.97 


4.2 


1 


79-86 




zf-C2H2 


Zinc finger, C2H2 type 


0.00067 


20.3 


1 


80-103 


2303 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.3 


3.6 


1 


95-115 


2303 


SPDY 


Domain of unknown function 
(DUF317) 


0.6 


4.4 


1 


119-133 


2303 


Di19


Drought induced 19 protein (Di19)


0.00056 


13.0 


1 


314-330 


2305 


ig 


Immunoglobulin domain 


1.7e-05 


23.4 


1


37-114 


2305 


Phage cap E


Phage major capsid protein E 


0.79 


2.8 


1


128-137 


2306 


MHC_I


Class I Histocompatibility antigen, d 


9.2e-142


481.1 


1 


32-210 


2306 


DUF497 


Protein of unknown function (DUF497) 


0.2 


6.7 


1 


50-63 






Immunoglobulin domain 


7.9e-09 


35.9 


1 


227-292 


2306 


DUF395 


YeeE/YedE family (DUF395) 


0.19 


7.2 


1 


317-342 




LBP BPI CETP
C


LBP / BPI / CETP family, C-terminal
do


5.8e-05 


18.1 


1 


15-98 


2307


LBP BPI CETP
C


LBP / BPI / CETP family, C-terminal
do


0.7 


3.5 


2 


113-138 


2308 


LBP BPI CETP
C


LBP / BPI / CETP family, C-terminal
do


5.8e-05 


18.1 


1 


15-98 


2308 


LBP BPI CETP 
C 


LBP / BPI / CETP family, C-terminal 
do 


0.7 


3.5 


2 


113-138 


2309 


LBP BPI CETP 
C 


LBP / BPI / CETP family, C-terminal 
do 


5.8e-05 


18.1 


1 


15-98 


2309 


LBP BPI CETP 
C 


LBP / BPI / CETP family, C-terminal 
do 


0.7 


3.5 


2 


113-138 


2310 


LBP BPI CETP 
C 


LBP / BPI / CETP family, C-terminal 
do 


5.8e-05 


18.1 


1 


15-98 


2310


LBP BPI CETP
C


LBP / BPI / CETP family, C-terminal
do


0.7


3.5


2


113-138










2311 


Secretogranin_V


Neuroendocrine protein 7B2 precursor


2 Re- 

*r«Ov 

136 


462.9 


1 


10-214 


2312


Cyto heme lyase 


Cytochrome c/cl heme lyase 


0.82 


1.8 


1 


97-120 


2313


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


6.9e-46 


159.3 




8-185 


2313 


Acyl transf 3 


Acyltransferase family


0.12 


6.3 


1 


110-155 


2314 


Cna B 


Cna protein B-type domain


0.17


5.9 


! 


52-85 


2314 


PDZ 


PDZ domain (Also known as DHR or


8.1e-12 


43.4 


1 


52-130 


91 1 < 
ZJ ID 


PID


Phosphotyrosine interaction domain
(PT


3.3e-47 


160.5 


I 


46-172 


2317 


pkinase 


Protein kinase domain 


8.3e-74 


255.4 


\ 


22-282 


2318


lipocalin


Lipocalin / cytosolic fatty-acid binding
pr


2.3e-42 


150.9 




58-206 


2318


Triabin


Triabin


0.0018 


12. i 


I 


139-156 


2319


lactamase_B


Metallo-beta-lactamase superfamily


2.3e-06 


24.6 


I 


26-74


2320


annexin


Annexin 


2.5e-05 


21.2 




1-20 


2320


annexin


Annexin


1.1e-29


107.6 


2 


26-92 


2320


annexin


Annexin


9.7e-28 


100.7 


3 


109-176 


2320


annexin


Annexin


2.8e-33 


120.4 


4 


185-251 


9191


SNF


Sodium:neurotransmitter symporter
fam


9.5e-260


873.0 


1 


38-417 


9191 


ATP-sulfurylase


ATP-sulfurylase


0.28 


3.8 


1 


42-64 


9191 


DUF900


Protein of unknown function (DUF900) 


0.98 


2.8 


1 


251-263 


9191 


Glypican


Glypican 


2.4e-60 


201.2 


1 


3-115 


2324


PAP_assoc


PAP/25A associated domain


1.6e-14 


51.8 


1 


274-333 


2324


Isochorismatase


Isochorismatase family 


0.49 


4.1 


I 


484-520 


2326


Sec23 trunk


Sec23/Sec24 trunk domain 


0.47 


4.0 


1 


22-33 


2326 


Hydrolase 


haloacid dehalogenase-like hydrolase 


0.77 


3.7 




26-56 


2327


Sec23 trunk


Sec23/Sec24 trunk domain 


0.47 


4.0 


1 


22-33 


2327


Hydrolase


haloacid dehalogenase-like hydrolase


0.77 


3.7 




26-56 


2328 


A2M 


Alpha-2-macroglobulin family


6.3e-23 


75.5 




4-86 


2329


A2M


Alpha-2-macroglobulin family


6.3e-23 


75.5 


1 


4-86 


2330


A2M


Alpha-2-macroglobulin family


6.3e-23 


75.5 


I 


4-86 


2331


A2M


Alpha-2-macroglobulin family


6.3e-23 


75.5 


I 


4-86 


2332


A2M


Alpha-2-macroglobulin family


6.3e-23 


75.5 


I 


4-86 


2333


COesterase


Carboxylesterase


4.3e-42 


142.8 


I 


8-142 


2333


A2M_N


Alpha-2-macroglobulin family N-termina


0.83 


2.3 


I 


12-28 


2334




EGF-like domain 


0.017 


11.7 


1 


5-26 


2334


TIL


Trypsin Inhibitor like cysteine rich
doma 


0.85 


3.5 


I 


10-26


2336


V^UlUlla liut 


Coronavirus non-structural protein NS 


0.47 


3.5 


I 


20-43 


2336 




Immunoglobulin domain 


0.052 


10.4 


1 


57-119 


2336 


fn3


Fibronectin type III domain 


2.4e-16 


58.5 




145-231 


2339 


Keratin B2 


Keratin, high sulfur B2 protein 


1e-19


69.2 




21-145 


2340 


Keratin B2 


Keratin, high sulfur B2 protein 


1e-19


69.2 




21-145 


2341 


Keratin B2 


Keratin, high sulfur B2 protein 


1e-19


69.2 




21-145 


2342


abhydro lipase 


ab-hydrolase associated lipase region 


1.9e-32 


117.8 




87-157 


2342 


abhydrolase 


alpha/beta hydrolase fold


9.5e-19 


67.6 




171-448 


2343 


abhydro lipase


ab-hydrolase associated lipase region 


1.9e-32 


117.8 




87-157 


2343


abhydrolase 


alpha/beta hydrolase fold 


9.5e-19 


67.6 




171-448 


2344 


7tm_3 


7 transmembrane receptor 
(metabotropic 


0.75 


3.2 




26-47


7tm 3


7 transmembrane receptor 
(metabotropic 


0.00057 


14.4 


2 


67-111 


2344 


Condensation 


Condensation domain 


0.36 


4.2 




159-171 




7tm 3 


7 transmembrane receptor 
(metabotropic 


1.4e-05


20.1 




170-273 


2345 


GASA 


Gibberellin regulated protein 


0.35 


1.3 




25-54 


2345 


lectin c 


Lectin C-type domain 


1.3e-25 


95.2 




103-216 


2346 


GASA 


Gibberellin regulated protein 


0.35 


1.3 




25-54 


2346 


lectin c 


Lectin C-type domain 


1.3e-25 


95.2 




103-216 


2347 


7tm 1 


7 transmembrane receptor (rhodopsin f


2.6e-50 


149.5 




99-348 


2347 


endotoxin N 


delta endotoxin, N-terminal domain 


0.87 


3.6 




253-283 


2347


Pox D2 


Pox virus D2 protein 


0.93 


1.2 




366-379 


2347 


7tm 1 


7 transmembrane receptor (rhodopsin f 


9.6e-48 


141.7 




417-666 


2347 


PAZ 


PAZ domain 


0.48 


4.7 




540-567 


2348 


Hanta G2 


Hantavirus glycoprotein G2 


0.098 


4.8 




84-112 


2350 


An peroxidase


Animal haem peroxidase


1.3e-91


311.6 




2-232 


2350 


7tm 1 


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 




24-32 


2350 


Peptidase C1


Papain family cysteine protease 


0.76 


2.1 




117-134


2351 


An peroxidase 


Animal haem peroxidase 


1.3e-91 


311.6 




2-232 


2351 


7tm 1 


7 transmembrane receptor (rhodopsin f 


0.22 


2.7 




24-32 


2351 


Peptidase C1


Papain family cysteine protease 


0.76 


2.1 




117-134 


2352 


Arch fla DE 


Archaeal flagella protein 


0.42 


5.0 




86-99 


2353 


UBX 


UBX domain 


0.36 


6.0 




141-159 


2353 


FTCD C 


Formiminotransferase-cyclodeaminase 


0.21 


6.0 




188-218 


2353 


3H 


3H domain 


0.46 


6.2 




248-260 


2354 


Torsin 


Torsin 


3e-189 


638.8 




17-288 


2354 


2_5_ligase


2',5' RNA ligase family


0.13 


7.6 




101-133 


2354 


DUF254 


SAND family protein 


0.22 


3.0 




110-129 


2355


SPX 


SPX domain 


0.84 


1.0 




66-86 


2359 


DUF895 


Eukaryotic protein of unknown 
function 


0.68 


4.1 




183-199 


2360 


RWD 


RWD domain 


9.8e-40 


142.2 




17-131 


2360 


globin 


Globin 


0.048 


8.6 




94-126 


2360 


eRF1 2


eRF1 domain 2


0.72 


4.1 




120-133 


2360 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger)


7.5e-10 


27.9 




141-207 


2360 


DNA ligase ZB
D 


NAD-dependent DNA ligase C4 zinc 
finge 


0.37 


5.8 




202-213 


2360 


zf-MIZ 


MIZ zinc finger 


0.28 


4.6 




203-213 


2360 


ApoA-II 


Apolipoprotein A-II (ApoA-II) 


0.94 


3.6 




267-278 


2361 


TMS TDE 


TMS membrane protein/tumour 
differentia 


0.048 


5.2 




33-63 


2361 


sugar tr


Sugar (and other) transporter 


1.5e-05 


19.2 




65-138 


2362 


ig


Immunoglobulin domain 


0.00079 


17.2 




42-98 


2363 


ig


Immunoglobulin domain 


0.00079 


17.2 




42-98 


2364 


IBN NT 


Importin-beta N-terminal domain 


6.3e-16 


58.6 




65-109 


2365 


HIM 


Haemagluttinin motif 


0.18 


7.7 




375-391 


2365 


Fascin 


Fascin protein 


0.29 


0.7 




808-818 


2370 


DUF350 


Domain of Unknown Function 
(DUF350) 


0.86 


5.1 




106-128 


2372


Apolipoprotein 


Apolipoprotein A1/A4/E family 


0.95 


3.4 




54-78 


2372 


F5 F8 type C 


F5/8 type C domain 


2.2e-36 


111.6 




84-171 


2373


Apolipoprotein 


Apolipoprotein A1/A4/E family 


0.95 


3.4 




54-78 


2373 


F5 F8 type_C 


F5/8 type C domain 


2.2e-36 


111.6 




84-171 


2374


Ribosomal S3 C 


Ribosomal protein S3, C-terminal
domai


0.98


3.4




65-71










2374 


Ricin B_lectin 


QXW lectin repeat 


0.13 


8.4 


1 


175-204 


2374 


Ricin B lectin 


QXW lectin repeat


0.00073 


16.5 


2 


266-304 


2375 


pkinase


Protein kinase domain


1.2e-11


41.7 


1 


38-119 


2377 


PH 


PH domain 


0.023 


9.0 


1 


43-81 


2377 


BTK 


BTK motif 


1.9e-06 


26.9 


1


105-141 




EGF


EGF-like domain 


0.047 


10.1




1-17 


9170 
l*j fy 


EGF 


EGF-like domain 


3.7e-07 


28.6 




33-68 


9170 


EB 


EB module 


0.73 


4.1 


1 


39-68 


2380


M tail 


Myosin tail 


0.32 


4.7 


1 


192-222 


2380


SurE


Survival protein SurE 


0.68 


2.6 


1 


317-330 


2380


Pox A11


Poxvirus A11 Protein


0.17 


3.2 


1 


364-382 


2385


7tm 1 


7 transmembrane receptor (rhodopsin f 


8.6e-47 


138.9 


1 


83-332 


2386


7tm 1


7 transmembrane receptor (rhodopsin f 


8.6e-47 


138.9 


1 


83-332 




Calcyon


D1 dopamine receptor-interacting
protein 


3.7e-42 


136.9 


1 


1-66 


2389 


IL1


Interleukin-1/18 


2.6e-23 


83.4 


1 


79-180 


2390 


filament 


Intermediate filament protein 


7.3e-68 


235.6 


1 


2-189 


2390 


K-box 


K-box region 


0.11 


7.0 




11-29 


2390


bZIP


bZIP transcription factor 


0.2 


7.0 


-j 


49-86 


2390 


Ribosomal L29 


Ribosomal L29 protein 


0.71 


5.2 


_1 


104-130 


2392


MgpC


MgpC protein precursor 


0.99 


2.8 




1-27 


2393


filament


Intermediate filament protein 


1.8e-14 


54.6 


I 


9-80 


2394


Peptidase M10


Matrixin 


4.5e-47 


166.6 


1 


11-103 


2394


Peptidase M10


Matrixin 


9.2e-17 


64.0 




107-145 


2395


Peptidase M10


Matrixin 


4.5e-47 


166.6 


1 


11-103


2395 


Peptidase M10 


Matrixin 


9.2e-17 


64.0 




107-145 


2396


SUFU


Suppressor of fused protein (SUFU) 


0 


1218.3


1 


3-484 


2397


LBP BPI CETP


LBP / BPI / CETP family, N-terminal 
doma 


3.4e-20 


69.4 


1 


38-103 


2398


MCPsignal


Methyl-accepting chemotaxis protein ( 


0.21 


4.1 


1 


363-379 


2399 


MCPsignal 


Methyl-accepting chemotaxis protein ( 


0.21 


4.1 




363-379 


2402


DUF846 


Eukaryotic protein of unknown functio 


4.3e-05 


15.0 


i 


63-93 


2405


UK 


Virulence determinant 


0.083 


7.0 


1 


48-72 


2405


TIP 120 


TBP (TATA-binding protein) -interacti


0 


3271.6


1 


59-1252 


2405 


HEAT 


HEAT repeat 


0.093 


8.3 


2 


282-320 


2405 


HEAT 


HEAT repeat 


0.04 


9.5 


3 


377-398 


2405 


Armadillo seg


Armadillo/beta-catenin-like repeat 


0.2 


8.0 


2 


716-755 


2406 


lectin c 


Lectin C-type domain 


2.5e-16 


64.4 


1 


168-274 


2412 


PTE 


Phosphotriesterase family 


8.2e-207


697.2 


1 


38-380 


2412 


gntR


Bacterial regulatory proteins, gntR f 


0.17 


7.0 


i 


141-160 


2414 


filament 


Intermediate filament protein 


0.28 


5.0 




199-228 


2414 


Transposase 8 


Transposase 


0.57 


5.0 




200-220 


2414 


DUF972 


Protein of unknown function (DUF972) 


0.76 


4.2 




201-245 


2414 


Rop 


Rop protein 


0.55 


3.6 




242-249 


2414 


MoaE 


MoaE protein 


0.18 


7.2 




467-480 


2414 


WH2 


WH2 motif 


0.14 


8.9 




468-485 


2419 


Pox int trans 


Poxvirus intermediate transcription fa 


0.092 


5.7 




119-147 


2419 


ABA WDS 


ABA/WDS induced protein 


0.81 


4.5 




185-201 


2419 


DUF738 


Protein of unknown function (DUF738) 


0.89 


3.3 




297-316 


2419 


IpaB EvcA 


IpaB/EvcA family 


0.65 


3.8 




460-485


PS_Dcarbxylase


Phosphatidylserine decarboxylase


3.9e-60 


209.9 


I 


80-323 


9499 




Unrhararteriyed RCR 0001481 


0.5 


3.4 




138-154 


9491 


ADP PFK GK 


ADP-specific
Phosphofructokinase/Gluco


4.1e-208


701.5 




6-408 


9491 


Mannitol dh 


Mannitol dehydrogenase


0.052 


6.8 


I 


310-329 


9494 


ASFV L11L 


African swine fever virus (ASFV)
L11L


0.89 


3.1 


I 


8-19 


2428 


Ribosomal S25 


S25 ribosomal protein 


0.00053 


15.1 


I 


12-43 


2429 


ank 


Ankyrin repeat 


0.011 


12.7 


I 


1-18 


2429 


ank 


Ankyrin repeat 


0.0036 


14.4 


2 


19-51 


2429 


ank 


Ankyrin repeat 


1.5e-11


44.5 


3 


52-84 


2429 


ank


Ankyrin repeat


1.2e-08 


34.1 


4 


85-117 


2429


ank


Ankyrin repeat


3.3e-08 


32.5 


5 


118-150 


2429


ank 


Ankyrin repeat 


3.4e-11


43.2 


6 


151-183 


2429


ank


Ankyrin repeat


1.3e-08 


33.9 


7 


184-217 


2429 


ank 


Ankyrin repeat 


0.0027 


14.9 


8 


218-250 


2429


ank


Ankyrin repeat


8.5e-08


31.1 


9 


251-283 


2429


ank


Ankyrin repeat


0.013 


12.4 


10 


284-308 


2429 


ank 


Ankyrin repeat 


8.3e-08 


31.1 




335-367 


2429


ank


Ankyrin repeat


1.1e-09


37.8 


12 


368-400 


2429


ank


Ankyrin repeat


6.9e-07 


27.8 


13 


401-461 


2429 


endonuclease 7 


Recombination endonuclease VII 


0.034 


9.6 




417-441 


2429


ank


Ankyrin repeat


0.0047 


14.0 


14 


462-485 




trypsin 


Trypsin


1.5e-23 


72.8 




61-237 






PDZ domain (Also known as DHR or

GLGF) 


5.1e-08 


30.0 


I 


285-339 




vwc 


von Willebrand factor type C domain


5.4e-05 


19.0 


I 


66-105 


2432


MethyltransfD12


D12 class N6 adenine-specific DNA
met


1.4e-36


128.4


I 


39-163 


2432 


Ribosomal L1


Ribosomal protein L1p/L10e family


0.57 


1.6 




99-117 


2433 


lipocalin


Lipocalin / cytosolic fatty-acid binding
pr


4.7e-11


42.2 


1 


12-80 


2434 


tRNA anti 


OB-fold nucleic acid binding domain 


2.7e-15 


56.4 


1 


71-145 


2434 


tRNA-synt 2


tRNA synthetases class II (D, K and N 


2.2e-59 


207.4 


1 


162-410 


2434 


Transglutamin C 


Transglutaminase family, C-terminal i


0.79 


4.1 


1 


229-256 


2434 


RNA helicase 


RNA helicase 


0.13 


5.7 


1 


266-308 


2435 


FAD binding 2 


FAD binding domain 


1.6e-53


181.4 


1 


22-117 


2436 


RasGEF 


RasGEF domain 


6.8e-18 


69.6 




35-115 


2437 


KH 


KH domain 


3.8e-17


61.6 


I 


78-124 


2437 


Peripla BP 2


Periplasmic binding protein 


0.71 


3.5 


I 


116-132 


2437 


KH 


KH domain 


2.4e-10 


38.1 




162-189 


2439 


transket pyr 


Transketolase, pyridine binding domai 


1.6e-51 


176.9 


-j 


76-254 


2439


DT rF094 


Bacterial protein of unknown function


0.88 


2.8 




77-98 


2439


Indigoidine_A


Indigoidine synthase A like protein


0.51 


4.4 


I 


233-247 


2439 


transketolase C 


Transketolase, C-terminal domain 


9.7e-42 


137.2 


1 


272-398 


2440 


Calsequestrin 


Calsequestrin 


8.5e-292


979.5 




42-427 


2440 


thiored 


Thioredoxin 


0.057 


9.0 




160-189 


2441 


Bacillus PapR 


Bacillus PapR protein 


0.68 


3.6 




62-77 


2441 


arf 


ADP-ribosylation factor family 


0.76 


2.9 




290-312 


2441 


RNA_capsid 


Calicivirus putative RNA 
polymerase/ca 


0.65 


1.9 




346-355 


2442 


ig 


Immunoglobulin domain 


2.7e-09 


37.6 




35-112 


2442 


Relaxase 


Relaxase/Mobilisation nuclease domain 


0.91 


3.4 




71-90


2442


Viral_coat


Viral coat protein


0.7 


4.0 


1 


140-152 






Immunoglobulin domain 


9.8e-07 


28.1 


2 


156-230 




cadherin


Cadherin domain 


0.045 


8.9 


1 


77-106 


Z443 




Cadherin domain 


1.4e-14


52.6 


2 


150-245 


2445 


cadherin


Cadherin domain


4.8e-24 


85.5 


3 


259-350 


2445 




Cadherin domain


3.6e-14 


51.1 


4 


364-455 


2445 


cadherin 


Cadherin domain 


2.9e-22 


79.4 


5 


469-565 


2445 


nerna^riiir o 


T-fomaao'iiiti'nin domain of 

VlH Pm 51 0 CTl 1 )tini 
ua.eiiiu.ggiui.iiii 


0.25 


4.6 


1 


515-531 


2445 


cadherin 


Cadherin domain 


8.2e-14 


49.9 


6 


594-677 


a A-t 

2447 


bnlJv 


QViTIf Hnmain 


0.92 


3.9


1 


74-82 


2448


zf-C2H2


Zinc finger, C2H2 type


0.00076 


20.0 


2 


50-73 


2448


zf-C2H2


Zinc finger, C2H2 type


0.036 


13.3 


3 


80-106 


2448


zf-C2H2


Zinc finger, C2H2 type


0.00095 


19.6 


5 


198-221


2450


Alpha_L_fucos


Alpha-L-fucosidase


0.018 


8.4 


1 


10-34 


2451




Translationally controlled tumour
protein


7.3e-13 


42.0 


1


20-54 


2452 


Herpes_gG 


Glycoprotein GG/GX 


0.39 


2.9 


1 


29-54 


2452


Osteopontin


Osteopontin


1.1e-128


410.7 


1 


42-218 


2452


Flu_M1


Influenza Matrix protein (M1)


1 


3.0 


1 


52-65 


2453


serpin


Serpin (serine protease inhibitor)


0.99 


2.0 


1 


68-92 


2454


HATPase_c


Histidine kinase-, DNA gyrase B-, and


3.8e-15 


54.5 


1 


92-240 


2454


Pew TSJ9T 


Poxvirus N2L protein


0.18 


5.3 


1 


162-176 


2454 


DNA gyraseB 


DNA gyrase B 


4.1e-57 


199.9 


1 


286-446


2454


FokI_N


Restriction endonuclease FokI, recogn


0.12 


6.3 


1 


530-539 


2454 


DNA_topoisoIV


DNA gyrase/topoisomerase IV, subunit 


7.6e-190


610.8 


1 


729-1196


2454 


DUF188 


Uncharacterized BCR, YaiI/YqxD
family


0.025 


8.2 


1 


1171-1197


2456 


PCI 


PCI domain 


0.27 


6.1 


1 


66-93 


2457




C1q domain


6.3e-06 


23.8 


1 


98-138 


2458 


BTB 


BTB/POZ domain


1.9e-17 


64.4 


1 


62-124 


2459 


LBP_BPI_CETP


LBP / BPI / CETP family, N-terminal 
doma 


3.4e-20 


69.4 


1 


38-103 


2460 


LBP_BPI_CETP 


LBP / BPI / CETP family, N-terminal 
doma 


3.4e-20 


69.4 


1 


38-103 


2461


LBP_BPI_CETP


LBP / BPI / CETP family, N-terminal


3.4e-20 


69.4 


1 


38-103


2462


DUF408


Domain of Unknown Function (DUF408)


le-11 


42.8 


1 


1-43 


2462




Protein of unknown function DUF584


0.67 


2.3 


1 


224-250


2464


Securin


Securin sister-chromatid separation
inhibito


1 


2.9 


1 


19-34 


2465 


Nuf2


Nuf2 family 


3.3e-104


356.3 


1 


6-153 


2465 


Corona NS2A 


Coronavirus NS2A protein 


0.42 


2.3 


1 


133-139 


2465


Syntaxin 


Syntaxin 


0.31 


6.3 


1 


142-242 


2465 


HR1 


Hr1 repeat


0.099 


7.1 


1 


192-219 


2465 


LEA 


Late embryogenesis abundant protein 


0.79 


5.0 


1


254-279 


2465 


Mob Pre 


Plasmid recombination enzyme 


0.97 


1.9 


1


366-376 


2465


G-gamma 


GGL domain 


0.08 


6.9 


1 


403-424 


2465 


OKR_DC_1_N


Orn/Lys/Arg decarboxylase, N- 
terminal 


0.19 


4.3 


1 


426-450


2466


pkinase 


Protein kinase domain 


0.036 


7.9 




62-123 


2466 


HEAT 


HEAT repeat 


0.13 


7.8 


1


335-373 


2467 


UreE C 


UreE urease accessory protein, C-termi 


0.09 


7.9 


1 


119-140 


2467 


Pox A type inc 


Viral A-type inclusion protein repeat 


0.59 


6.3 




194-215 


2470 


DUF563 


Protein of unknown function (DUF563) 


0.86 


2.4 


1


36-47 


2471 


Fumarate red D 


Fumarate reductase subunit D 


0.28 


5.3 


1 


79-103 


2473 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


2.9e-06 


17.8 


1 


59-96 


2473 


IBR 


IBR domain 


0.0064 


10.5 


1 


70-114 




zf-C3HC4 


Zinc finger, C3HC4 type (RING finger) 


0.029 


6.4 




124-144 


2474 


Aa_trans


Transmembrane amino acid transporter 
P 


1.4e-41


148.3 


1 


72-276 




Aa_trans


Transmembrane amino acid transporter 

D 


1.4e-41 


148.3 


1 


72-276 


2477 


LRR 


Leucine Rich Repeat 


0.093 


8.7 


1 


30-51 


2477 


LRR 


Leucine Rich Repeat 


0.57 


6.0 




56-77 


2478 


SAB 


SAB domain 


0.65 


5.3 


1 


57-82 


2478 


RPE65 


Retinal pigment epithelial membrane 
prote 


4.3e-27 


91.2 


1 


85-202 


2479 


SAB 


SAB domain 


0.65 


5.3 


1 


57-82 


2479 


RPE65 


Retinal pigment epithelial membrane 
prote 


4.3e-27 


91.2 


1 


85-202 


2480 


Pox A typeinc 


Viral A-type inclusion protein repeat


0.42 


6.8 


1 


18-40 


2480 


spectrin 


Spectrin repeat 


0.61 


4.8 


1 


18-45 


2481 


CD34 antigen 


CD34 antigen protein 


0.88 


0.3 


1


6-34 


2481 


DUF999 


Protein of unknown function (DUF999) 


0.28 


5.1 


1 


14-35 


2481 


BCL N 


BCL7, N-terminal conserver region 


0.39 


6.1 


1


174-195 


2481 


serpin 


Serpin (serine protease inhibitor) 


3.7e-24 


81.8 


1 


431-546 


2482 


PMP22 Claudin 


PMP-22/EMP/MP20/Claudin family 


3.7e-89 


306.4 


1 


40-218 


2482 


mce 


mce related protein 


0.74 


4.3 


1 


159-179 


2483 


PAP2 


PAP2 superfamily 


4.5e-15 


54.4 


1 


24-159 


2488 


SMC C 


SMC family, C-terminal domain 


1.6e-17 


60.6 


1 


418-475 


2488 


SMC C 


SMC family, C-terminal domain 


1.1e-43


150.2 


2 


477-540 


2488 


Armadillo_seg 


Armadillo/beta-catenin-like repeat 


4.6e-14 


53.0 


2 


551-591 


2488 


Armadillo_seg


Armadillo/beta-catenin-like repeat 


1.4e-08 


33.5 


3 


594-634 


2489 


IER 


Immediate early response protein (IER) 


0.063 


3.8 


1 


194-206 


2490 


disintegrin 


Disintegrin 


3.3e-36 


123.1 


1 


4-79 


2490 


EGF 


EGF-like domain 


0.0023 


14.8 


1 


231-258 


2499 


ARPF 


Aromatic-Rich Protein Family 


1.4e-10


36.3 


1 


89-234 


2502 


Pafl 


Pafl 


2.3e-17 


65.0 




1-61 


2502 


ig


Immunoglobulin domain 


0.015 


12.4 


1 


68-113 


2507 


ank 


Ankyrin repeat 


0.025 


11.4 




186-204 


2504 


PH 


PH domain 


0.028 


8.7 


1 


61-153 


2504 


DAGKc 


Diacylglycerol kinase catalytic domain 


0.00051 


15.4 


4 — 


161-213 




ig


Immunoglobulin domain 


0.069 


9.9 




48-120 


2505 


ig 


Immunoglobulin domain 


6.5e-09 


36.2 




161-219 


2506 


ig 


Immunoglobulin domain 


0.069 


9.9




48-120 


2506


ig 


Immunoglobulin domain 


6.5e-09 


36.2 




161-219 


2508 


7tm_1


7 transmembrane receptor (rhodopsin 
famil 


8.2e-29 


85.0 




49-179 


2508 


7tm_1


7 transmembrane receptor (rhodopsin 
famil 


1.6e-13 


39.1 


2 


210-267 


2517 


Acyl-CoA_dh_M


Acyl-CoA dehydrogenase, middle 
domain 


0.0071 


11.7 


1 


99-136 


2517


Acyl-CoA dh 


Acyl-CoA dehydrogenase, C-terminal 


6.7e-50 


175.9 


1 


415-566












Acyl-CoA_dh_M


Acyl-CoA dehydrogenase, middle
domain 


0.0071 


11.7 


1 


99-136 


2518 


Acyl-CoA_dh


Acyl-CoA dehydrogenase, C-terminal
doma 


6.7e-50 


175.9 


1 


415-566 


2519 


Cation efflux 


Cation efflux family


3e-09 


34.4 


1


33-109 


2520 


CaMBD 


Calmodulin binding domain 


0.074 


7.8 


1


451-467 


2520 


IQ


IQ calmodulin-binding motif


1.3e-05


22.1 


2 


473-493 


2520 


IQ


IQ calmodulin-binding motif


1.6e-05 


21.8 


3 


532-552 


2524 


PAP2 


PAP2 superfamily 


7.6e-19 


67.8 


1


39-151 


2525 


PAP2 


PAP2 superfamily


7.6e-19 


67.8 


1 


39-151 


2529 


LRR 


Leucine Rich Repeat


0.00025 


17.3 


1


201-226 


2529 


LRR 


Leucine Rich Repeat


0.0019 


14.3 


2 


227-246 


2529 


LRR 


Leucine Rich Repeat


0.13 


8.1 


4 


272-297 


2529 


LRR 


Leucine Rich Repeat


0.00025 


17.3 


5 


298-317 


2529 


LRR 


Leucine Rich Repeat


5.2e-05 


19.6 


6 


319-342 


2529 


LRR 


Leucine Rich Repeat


0.37 


6.6 


7 


343-368 


2530 


ig


Immunoglobulin domain


0.26 


7.8 


1 


55-122 


2530 


ig


Immunoglobulin domain 


0.0043 


14.5 


2 


162-220 


2530 


ig


Immunoglobulin domain 


0.00023 


19.2 


3 


267-321 


2531 


ig




Immunoglobulin domain 


0.26 


7.8 


1 


55-122 


2531 


ig


Immunoglobulin domain 


0.0043 


14.5 


2 


162-220 


2531 


ig


Immunoglobulin domain 


0.00023 


19.2 


3


267-321 


2532 


tsp 1 


Thrombospondin type 1 domain 


2.9e-07 


25.9 


1 


59-103 


2533 


Guanylin 


Guanylin precursor 


0.0007 


9.1 


1 


12-35 


2533 


Apo-CII 


Apolipoprotein C-II 


3.4e-57 


200.2 


1 


34-111 


2534 


Guanylin


Guanylin precursor 


0.0007 


9.1 


1 


12-35 


2534 


Apo-CII


Apolipoprotein C-II 


3.4e-57 


200.2 


1 


34-111 


2536 


zf-C2H2 


Zinc finger, C2H2 type 


0.0012 


19.3 


1 


279-301 


2536 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-06 


30.3 


2 


307-329 


2536 


zf-C2H2 


Zinc finger, C2H2 type 


0.086 


11.7 


3 


337-355 


2540 


FA desaturase 


Fatty acid desaturase 


5.1e-42 


145.2 


1 


8-159 


2541 


rnaseH


RNaseH 


7.1e-16 


53.4 


1 


86-184 


2541 


MutS_III


MutS domain III


4.2e-06 


22.9 


1 


253-277 


2541 


MutS V 


MutS domain V 


6e-164 


543.6 


1 


282-517 


2542 


ig


Immunoglobulin domain 


7.6e-08 


32.2 


1 


1-57 


2542 


fn3


Fibronectin type III domain 


2.8e-16 


58.3 


1 


79-165 


2543 


MAM 


MAM domain 


1.5e-43 


154.8 


1 


3-102 


2544 


kazal 


Kazal-type serine protease inhibitor 
domain 


7.7e-06


25.8 


1 


40-87


2544 


ig


Immunoglobulin domain


4.1e-07 


29.5 


1 


105-174 


2545 


RNA helicase


RNA helicase


0.031 


7.9 


1 


85-112 


2545


ATP-bind


Conserved hypothetical ATP binding
prote 


0.055 


7.3 


1 


90-103 


2547 




Immunoglobulin domain 


0.015 


12.4 


1 


10-28 


2547 


ig 


Immunoglobulin domain 


0.098 


9.4 


2 


72-98 


2549 


serpin 


Serpin (serine protease inhibitor) 


5.4e-18 


60.8 


1 


68-112 


2551 


DREV 


DREV methyltransferase 


7.3e-233


680.7 


1 


57-318 


2553 


ank 


Ankyrin repeat 


1.8e-07 


29.9 


2 


44-76 


2553 


ank 


Ankyrin repeat 


0.026 


11.4 


3 


77-102 


2554 


pkinase 


Protein kinase domain 


3.5e-64 


223.4 


1 


117-375 


2555 


pkinase 


Protein kinase domain 


3.5e-64 


223.4 


1 


117-375 


2557 


tRNA-synt_1e


tRNA synthetases class I (C) 


0.0002 


14.0 


1 


99-129


2557


tRNA-synt_1


tRNA synthetases class I (I, L, M and 
V) 


2.4e-07 


23.7 


1 


99-137 


2558 


MHC_II_beta


Class II histocompatibility antigen, beta 


1.4e-43 


149.3 


1 


41-116 


2562 


fn3


Fibronectin type III domain 


0.0065 


11.9 


1


18-105 


2563 


A2M 


Alpha-2-macroglobulin family


6.3e-23 


75.5 


1 


4-86


685


1-26 


0.982 


0.908 


689 


1-19 


0.975 


0.888 


691 


1-49 


0.944 


0.603 


695 


1-24 


0.993 


0.943 


697 


1-26 


0.919 


0.670 


698 


1-20 


0.988 


0.939 


706 


1-20 


0.989 


0.973 


707 


1-24 


0.973 


0.922 


710


1-33 


0.957 


0.789 


712 


1-57 


0.975 


0.488 


714 


1-42


0.958 


0.680 


715 


1-42 


0.958 


0.687 


725 


1-18 


0.978 


0.956 


728 


1-22 


0.980 


0.917 


732 


1-27 


0.974 


0.932 


733 


1-27 


0.974 


0.932 


734 


1-27 


0.974 


0.932 


738 


1-75 


0.923 


0.462 


742 


1-23 


0.905 


0.707 


744 


1-33


0.981 


0.884 


747 


1-20 


0.991 


0.954 


748 


1-30 


0.950 


0.785 


753 


1-30 


0.991 


0.936 


754 


1-17 


0.978 


0.905


755 


1-16 


0.967 


0.933 


756 


1-18 


0.970 


0.897 


757 


1-17 


0.948 


0.869 


758 


1-17 


0.948 


0.869 


759 


1-21 


0.916 


0.820 


762 


1-14 


0.972 


0.951 


781 


1-38 


0.917 


0.618 


784 


1-21 


0.984 


0.869 


796 


1-19 


0.982 


0.959 


797 


1-19 


0.982 


0.959 


798 


1-19 


0.982 


0.959 


800 


1-65 


0.857 


0.487 


801 


1-45 


0.903 


0.565 


803 


1-36 


0.985 


0.834 


804 


1-21 


0.993 


0.855 


806 


1-20 


0.937 


0.779 


807 


1-20 


0.937 


0.779 


808 


1-20 


0.937 


0.779 


809 


1-32 


0.972 


0.885 


811 


1-25 


0.991 


0.948 


814 


1-28 


0.948 


0.827 


815 


1-33 


0.947 


0.744 


816 


1-23 


0.986 


0.908 


817 


1-23 


0.986 


0.908 


819 


1-21 


0.959 


0.755 


825 


1-35 


0.974 


0.637


826


1-42 


0.981 


0.909 


834 


1-21 


0.978 


0.751 


835 


1-44 


0.985 


0.831 


836 


1-44 


0.985 


0.814 


838 


1-31 


0.986 


0.935 


844 


1-18 


0.951 


0.879 


845 


1-18 


0.951 


0.879 


848 


1-20 


0.992 


0.794 


852 


1-24 


0.976 


0.901 


853 


1-24 


0.976 


0.901 


855 


1-25 


0.933 


0.751 


858 


1-24 


0.915 


0.567 


867 


1-17 


0.968 


0.863 


868 


1-17 


0.968 


0.863 


869 


1-34 


0.987 


0.781 


870 


1-16 


0.901 


0.686 


872 


1-14 


0.964 


0.931 


877 


1-21 


0.988 


0.958 


878 


1-22 


0.915 


0.833 


879 


1-25 


0.922 


0.765 


880 


1-25 


0.922 


0.765 


882 


1-20 


0.917 


0.819


888


1-24 


0.985 


0.945 


889 


1-17 


0.989 


0.945 


890 


1-23 


0.995 


0.938 


891 


1-24 


0.971 


0.882 


893 


1-16 


0.891 


0.770 


894 


1-20 


0.972 


0.859 


900 


1-22 


0.931 


0.862 


901 


1-24 


0.993 


0.937 


907 


1-22 


0.974 


0.850 


908 


1-23 


0.993 


0.950 


909 


1-15 


0.994 


0.617 


910 


1-23 


0.993 


0.950 


919 


1-15 


0.947 


0.797 


924 


1-19 


0.964 


0.927 


925 


1-19 


0.964 


0.927 


927 


1-26 


0.962 


0.783 


930 


1-43 


0.987 


0.765 


932 


1-31 


0.992 


0.803 


934 


1-23 


0.984 


0.884 


936 


1-48 


0.967 


0.624 


939 


1-30 


0.973 


0.851 


941 


1-18 


0.978 


0.957 


942 


1-21 


0.978 


0.937 


948 


1-21 


0.965 


0.760 


951 


1-29 


0.989 


0.946 


954 


1-31 


0.945 


0.587 


956 


1-22 


0.836 


0.491 


958 


1-28 


0.984 


0.903


960


1-24 


0.987 


0.924 


961


1-24 


0.987 


0.924 




1-24 


0.987 


0.924 


965


1-21 


0.993 


0.934 


966


1-43 


0.974 


0.653 


967


1-32 


0.953 


0.778 


968


1-40 


0.972 


0.632 


970


1-24


0.981 


0.938 


971


1-94 


0.981 


0.776 


y / j 


1-98 i 


0.923 


0.694 


978


1-37 


0.968


0.746 




1-37 


0.968


0.746 


you 




0.984


0.943


yo 1 


1-1 8 


0.961


0.869 


089 


1-24 


0.971 


0.865 


081 


1 91 


0.988


0.937 


084 


1-90 I 


0.938


0.716 


08^ 


1-20 


0.938 


0.716 


Q86 


1-95 


0.913 


0.560 


Q88 


1-16 


0.969 


0.949 


003 

yyo 


1-39 


0.972 


0.817 


004 


1-21 


0.970 


0.808 


006 
yy\j 


1-22 


0.977


0.837 


1006 


1-35 


0.967 


0.668 


1010 


1-24 


0.980 


0.902 


1013


1-24 


0.987 


0.903 


1014 


1-24 


0.987 


0.903 


1017 


1-23 


0.932 


0.654 


1019


1-20 


0.984 


0.868 


1021 


1-25 


0.948 


0.735 


1024


1-23 


0.968 


0.924 


1027 


1-25 


0.956 


0.848 


1028 


1-16 


0.993 


0.980 


1029 


1-16 


0.993 


0.980 


1031 


1-33 


0.985 


0.813 


1030 


1-46 


0.982 


0.666 


1041


1-41


0.988 


0.886 


1fl4fi 


1-94 


0.991


0.940 




1 10 


0 001 


0.934 




1-91 


0 091 

\j.yy x 


0.903 


1053 


1-25 


0.971 


0.897 


1054 


1-24 


0.975 


0.932 


1055 


1-18 


0.986 


0.965 


1057 


1-18 


0.978 


0.887 


1058 


1-18 


0.978 


0.887 


1060 


1-26 


0.987 


0.917 


1062 


1-34 


0.991 


0.901 


1066 


1-31 


0.992 


0.741 


1068 


1-22 


0.962 


0.919 


1072 


1-22 


0.986 


0.943


TABLE 5



SEQ ID


Position 


Maximum score 


Average score 


1073 


1-23 


0.974 


0.799 


1075 


1-33 


0.986 


0.886 


1076 


1-23 


0.969 


0.696 


1077 


1-23 


0.969 


0.696


1078 


1-17 


0.978 


0.905 


1079 


1-30 


0.935 


0.717 


1080 


1-17 


0.978 


0.905


1081 


1-17 


0.978 


0.905


1082 


1-17 


0.978 


0.905 


1083 


1-26 


0.936 


0.809 


1084 


1-23 


0.993 


0.907


1085 


1-18 


0.969 


0.643 


1092 


1-19 


0.937 


0.713


1096 


1-39 


0.995 


0.594 


1097 


1-39 


0.995 


0.594 


1100 


1-20 


0.964 


0.902 


1101 


1-23 


0.993 


0.950 


1102 


1-23 


0.993 


0.950 


1105 


1-21 


0.987 


0.963 


1106 


1-19 


0.947 


0.709 


1111 


1-13 


0.911 


0.718 


1117 


1-20 


0.930 


0.706 


1118 


1-16 


0.964 


0.790 


1121 


1-24 


0.968 


0.825 


1123 


1-20 


0.991 


0.881 


1128 


1-22 


0.969 


0.871 


1129 


1-25 


0.985 


0.864 


1130 


1-25 


0.985 


0.864 


1131 


1-20 


0.958 


0.893 


1132 


1-21 


0.942 


0.717 


1134 


1-24 


0.976 


0.925 


1136 


1-14 


0.972 


0.951 


1137 


1-19 


0.960 


0.901


1139 


1-33 


0.995 


0.835 


1140 


1-30 


0.993 


0.853 


1141 


1-30 


0.993 


0.853 


1143 


1-35 


0.974 


0.637 


1144 


1-42 


0.981 


0.909 


1145 


1-21 


0.975 


0.874 


1150 


1-21 


0.914 


0.729 


1152 


1-17 


0.990 


0.973 


1153 


1-17 


0.990 


0.973 


1155 


1-23 


0.965 


0.907 


1161 


1-39 


0.954 


0.705 


1162 


1-45 


0.929 


0.575 


1165 


1-19 


0.939 


0.857 


1167 


1-25 


0.951 


0.619 


1170 


1-37 


0.978 


0.830 


1172 


1-16 


0.957 


0.870 


1173 


1-16 


0.957 


0.870


1174


1-21 


0.914 


0.729 


1178 


1-25


0.980 


0.925 


1179 


1-17 


0.915 


0.659 


1181 


1-22 


0.950 


0.719 


1186 


1-18 


0.985 


0.928 


1192


1-18 


0.960 


0.803


1196 


1-48 


0.905 


0.599 


1199 


1-20 


0.988 


0.955 


1200 


1-16 


0.907 


0.635 


1205 


1-25 


0.974 


0.781 


1207 


1-28 


0.965 


0.842 


1208 


1-23 


0.965 


0.693 


1210 


1-21 


0.988 


0.911 


1213 


1-31 


0.940 


0.696 


1214 


1-17 


0.983 


0.956 


1218 


1-23 


0.996 


0.969 


1219 


1-15 


0.967 


0.909 


1221 


1-16 


0.978 


0.938 


1222 


1-32 


0.939 


0.646 


1223 


1-23 


0.982 


0.945 


1226 


1-31 


0.991 


0.925


1228 


1-32 


0.953 


0.778 


1231 


1-23 


0.965 


0.907 


1232 


1-23 


0.965 


0.907 


1233 


1-23 


0.965 


0.907 


1235 


1-21 


0.873 


0.596 


1240 


1-20 


0.987 


0.949 


1241 


1-22 


0.994 


0.890 


1244 


1-27 


0.998 


0.952 


1245 


1-27 


0.998 


0.952 


1247 


1-23 


0.980 


0.931 


1253 


1-17 


0.945 


0.731 


1258 


1-20 


0.984 


0.923 


1259 


1-32 


0.956 


0.757 


1261 


1-20 


0.967 


0.781


1262 


1-18 


0.961 


0.886 


1265 


1-23 


0.991 


0.915 


1266 


1-23 


0.991 


0.915 


1267 


1-19 


0.973 


0.788 


1268 


1-34 


0.988 


0.888 


1269 


1-21 


0.922 


0.610 


1271 


1-23 


0.910 


0.653 


1272 


1-18 


0.997 


0.757 


1275 


1-29 


0.989 


0.943 


1278 


1-34 


0.994 


0.867 


1279 


1-15 


0.983 


0.957 


1280 


1-15 


0.969 


0.641 


1281 


1-36 


0.916 


0.620 


1282 


1-36 


0.916 


0.620 


1283 


1-36 


0.896 


0.584


1287 


1-18 


0.836 


0.471 


1288 


1-31 


0.952 


0.767 


1290 


1-22 


0.962 


0.904 


1292 


1-33 


0.904 


0.641 


1293 


1-33 


0.904 


0.641 


1295 


1-27 


0.962 


0.882


1297 


1-30 


0.995 


0.964 


1298 


1-30 


0.995 


0.964 


1300 


1-25 


0.998 


0.961 


1302 


1-16 


0.921 


0.729


1303 


1-24 


0.991 


0.913 


1310 


1-52 


0.987 


0.492


1311 


1-19 


0.903 


0.592 


1314 


1-16 


0.887 


0.735 


1315 


1-27 


0.911 


0.682 


1316 


1-27 


0.911 


0.682


1317 


1-25 


0.987 


0.924 


1319 


1-20 


0.973 


0.759 


1320 


1-20 


0.968 


0.733 


1322 


1-16 


0.969 


0.894 


1323 


1-16 


0.969 


0.894 


1324 


1-28 


0.957 


0.874


1325 


1-17 


0.972 


0.946 


1326 


1-17 


0.972 


0.946 


1327 


1-18 


0.905 


0.593 


1328 


1-16 


0.895 


0.561 


1329 


1-17 


0.978 


0.896


1330 


1-20 


0.988 


0.963 


1333 


1-24 


0.985 


0.965 


1335 


1-22 


0.966 


0.767 


1343 


1-32 


0.954 


0.675 


1344 


1-18 


0.951 


0.879 


1345 


1-30 


0.978 


0.901 


1347 


1-20 


0.961 


0.880 


1348 


1-18 


0.978 


0.940 


1350 


1-23 


0.989 


0.868 


1352 


1-23 


0.993 


0.883


1354 


1-25 


0.924 


0.567 


1358 


1-18 


0.993 


0.909 


1359 


1-15 


0.855 


0.706 


1360 


1-31 


0.985 


0.908 


1361 


1-17 


0.995 


0.950 


1362 


1-17 


0.995 


0.950 


1364 


1-29 


0.962 


0.860 


1366 


1-17 


0.978 


0.905 


1368 


1-26 


0.958 


0.843


TABLE 6



SEQ ID


GENOMIC LOCATION


1


17 


9 


17 


J 


17 


A 


17 


_> 


17 


O 


7 


7 


7 


o 
o 




0 


1 < 

L J 


LU 


70n1 1 7-11 77 
zvp 1 1 .Z- l 1 ,zz. 


1 1 

1 I 


4 


1 7 


17 


1 "5 


11 
1 J 


1 A 
14 


14n74 


1 J 


1 R 


10 j 


A 

*T 


1 7 
1 / 


lzij 


1 R 
lo 


7 


1Q 


?2n1?2 


70 


1 7 


L 1 


1 


LL 


70 


Zj 


1 


74 
Z4 




7S 


I J 


76 


1 1n14 


97 


2q2L2 


98 


7n71 9 


90 


1 2 

IZr 


io 


4 


1 1 


4 


17 


1Q 


11 


1 0-<?nppifir 


14 


j 


j j 


7 


16 




17 


11 


1R 

JO 


17 


19 
j? 


10 


41 


Xn22 




Q 
o 


*TJ 


19 


44 


6 


4*\ 
4j 


J 


46 


1 On 1 1 1 i 


47 


1 


48 


1 


49 


1 


50 


1 


51 


X 


52 


22q 13.33 


53 


19 


54 


19 


55 


6 


56 


3


57


4 


58 


13 


59


11 


60


1 


61 


19 




13 


63 


12 


64 


6 


65 


1 


66 


4 


67 


7 


68 


20 


69 


1 1 
1 l 


70 


17n13 


71 


1n32 1-33 * 


72 




73 


14 


74 


14 


75 


6nl 1 2-12 3 


76 


11a 

1 _Ly — 


77 


15 


78 


19 


79 


? 


OA 

80 


Q 


O 1 

81 




82 


9 


83 


11 


84 


2p13


OJ 


11 


Q£ 
OO 


1p36.2-36.33


C7 


1 


oo 

88 


17 


on 

89 


20p13


OO 

yu 


7a22-a31 1 


01 

91 


4 


oo 

92 


7 


oi 


17q25


y4 


11q13.3


os 
yj 


8a22-a23 


0£ 

yo 


11 


no 

y8 


g 


yy 


7 


1UU 


4 


1 A 1 

101 






6al6 3-22 1 


103 


4 


104 


5 


105 


5 


106 


Xp11.3


107 


12pter-p13.31


108 


4 


109 


19 


110 


10cen-q26.11


111 


19p13.1


112 


17


1 1 1 


17 


1 1 A 
1 14 


17 


1 ic 
1 ID 


6 


1 1 fs. 


14 


I I / 


14 


1 io 

1 15 


10 


1 1 Q 


3 


12U 


1 


1 1 1 

1/1 


10 


12/ 


14 


1 Ol 

12J 


14 


1 1/1 

124 


4 


1 oc 

123 


15q21.3


1 1/C 


15 


12/ 


6 


131 


17 


1 

132 


11a 

1 l n 


i n 

13J 


11a 

1 a h ■ 


13.) ...... 


1p36.12


13o 


16 


1 J / 


18 


1 ^52 

1 JO 


3 


i 


2 


14U 


11q13


141 


7q33-q35 


1 A9 
142 


7q33-q35 




4 


1 AA 
144 


6q25.3-27 


1 A<\ 
143 


8 


1 A£ 
140. 


8 


1/7 
14 / 


17 


1 AR 
14& 


17 


1 AQ 


11 


13v 


22 


131 


17 


132 


17 


13.3 : _., 


Xq22 


1 ^A 
134 


5 


133 


14 . 


130 


13q12.11-12.3


13 / 


9 


13o 


22q12


1 KQ 


22q12


low _ 


10 


lol 


10 


163 


15q15


164 


15q15


165 


17 


166 


4 


167 


4p16


168 


4p16


169 


4p16


170 


11 


171 


5 


172 


14


173 


15 


174 


15 


175 


8p21.3-q11.1


176 


8p21.3-q11.1


177 


3p21-p12


178 


19


179 


16 


180 


22q12


181 


X 


182 


6 


1 81 


5p13


1 84 

1 0*T 


5p13




12q


1 87 


10 


1 88 


3 


189 


2 


190 


22 


191 


7 


192 


19 


193 


14 


194 


11q22


195 


11q22


196 


11q22


197 


15 


198 


4 


199 


4q28-q32 


200 


6p21.1-22.2


201 


12 


202 


2p24 


203 


2p24


204 


1


205 


14 


206 


18 


207 


20 


208 


19 


209 


1 


210 


4 


211 


1 


212 


4 


213 


3 


214 


3 


215 


9 


216 


1 


217 


11 


218 


7q 


219 


7q 


220 


17 


221 


5 


222 


15 


223 


10 


224 


8 


225 


8 


226 


8 


227 


9 


228 


8


990 
ZZi/ 


18 


97fi 
ZJU 


7nter-D22 


971 
ZJl 


7 


979 
zjz 


17 


977 


17 


ZjH- 


17 


97^ 


10


976 


1p34.1-1p35.


977 


4p16-p15


978 


4p16-p15


970 


4p16-p15


9 AO 


4p16-p15


941 
Z*f L 


4p16-p15


949 

Z*tZ 


20 


947 


6p12.3-21.2




2 


94^ 

Zhj , 


6d21 2-22 1 


946 


19q13.4


947 

Z*t f 


11q13.3


948 


6 


940 


18 


951 


3 


959 


11 


253 


11 


254 


19 




6 


256 


2 


957 


Xp22 


955? 


Xp22 


259 


20q12-13.1


260 


20q12-13.1


261 


20q12-13.1


262 


20q12-13.1


263 


16 


264 


4 


265 


11 


266 


9q34.2-34.3 


267 


20 


96R 


12 


960 
zoy 


7 


970 
z / u 


9 


971 
Z / I 


20 


979 

Z/Z 


14 


977 

Z / J 


9p34. 1-35.1 


974 

Z /*T 


17 


275 


10 


276 


10 


277 


10 


278 


10 


279 


1p36.11-36.31


281 


11 


282 


3p21.3


283 


14 


284 


1 


285 


1


zoo 


1 1 

1 1 


1 QH 
ZO / 


22al2 3-13 2 


Zoo ... ,,. 




loy ^ 


17q21.3 


OOA 

zyu 


2 


OQ1 


1d?2 2-31 I 


ooo 

zyz 


1d22 2-31 I 


OQ7 


19 


zy4 


Xa23 


295 


Xa27 


296 


10a 2S 3-a26 2 


297 


1 
I 


29© 


20a 12-1 3 2 


zyy 






0 


3U1 


0 

z> 


JUZ 


1?a24 1 




20all 21-13 13 


3U4 


g 

o 


3 ID 


I 


3UO 


12a 


30/ 


18nl2-a2l 


3Uo 


14 


7AO 




3 IU 


2 


11 1 
311 


0 n 70 1-n76 3 


7 1 o 

J IZ 


12n13 32 


11 7 


5 


J IH 


13 


71 *\ 


22q13.1


J ID 


18 




11 


71 R 


11 


710 

3iy 


7 

•j 


70fi 
oZU 


2 


70 1 
3Z1 


4a27 


709 


11 


707 

3Z3 


1 1 

1 1 


70A 
3Z4 


Xal2 1-13 


70< 

3Z3 


0 


70£ 
3Z0 


22. 


70 Q 
3Zo 


7 

j 


70Q 

3zy 


0 


17 fl 

330 


Q 


331 


< 


772 


22q13.31-13.33


333 


22q13.31-13.33


334 


10 


335 


3 


336 


15 


337 


5 


338 


5 


339 


3p 


340 


3 


341 


3


14? 


5 


141 
0*+0 


12 


144 


14


14S 


14 


146 


3 




16 


148 


4 


14Q 


6q24.3-25.3 




5 


1^1 


1 


157 


9q31.1-31.3


151 


2 


ISA 


16 


155 


6 


156 


5q13


157 


17 


15Q 


6 


1AH 


19 


161 


5


16? 


11q14.3-q21


161 


12q 


'If. A 


22q13.31-13.33


165 


9 


166 


2 


167 


2 


16R 


22q11.1-q11.2


160 


16 


17ft 


17 


171 


11 


17? 


20 


171 


20 


174 


20 


175 


20 


176 


15q13-q14


177 


1 


178 

J / o 


16p13.3


170 


20 


jOU 


3 


1R1 


1p32-p31


1R9 


q42.2-43 


181 


X 


1R4 

00*+ 


15 


IRS 

OO J 


16 


1R6 


8 


187 


3p25-p24


388 


11 


389 


7 


390 


7 


391 


2 


392 


22 


393 


22 


394 


12 


395 


12 


396 


12 


397 


12


398 


IZ 


399 


1 A 

14 


400 


Z 


401 


J 


402 


i 
1 


403 


1 0/-t1 1 A. 

iyq id.** 


404 




40S 


1 /q Lz-qzi 


406 


I /qlZ-qzi 


407 


1 /qlz-qzl 


408 


1U 


409 


1 A 

10 


410 


lzplj.i 


411 


1 "3 O O 1 1 

AqlJ.z-zl.l 


412 


lzpiJ 


413 


lzpiJ 


414 




415 


7qzz 


416 




417 


0 
0 


418 


0 
0 


419 


1 0 


420 


1 


421 


Iqzj.l-J 1.0 


422 


0 
0 


423 


1 1 


424 


1 1 


425 


1 < 
10 


426 




427 


1 1 

1 1 


428 


Z 


429 




430 


g 
O 


•431 


A 

4 


432 


4 


433 


1 0 


434 


1 Q 


435 




436 


1 /I 


437 




438 


ZZCjli 


439 


ZUpl 1.ZZ-IZ.Z 


440 


Z 


441 


z 


442 


4 


A a ^ 

443 


1 


444 


1 


445 


8 


446 


8 


447 


7p11.2-q11.2


448 


8 


449 


19 


450 


6 


451 


7q22-q31.1 


452 


19


453 


4q22-q24 


454 


17 


455


8 


456 


1 


457 


1 


458 


19 


459 


7q33-q35 


460 


7q33-q35 


461 


14 


462 


4 


463


2


464 


17 


465 


17 


466 


19 


467 


12 


468 


11q22


469


11


470 


10cen-q26.11


471 


22q12-13.


472 


20q13.33


473 


1 


474 


1 


475 


17 


476 


18 


477 


3 


478 


12 


479 


16 


480 


5 


481 


2 


482 


2q33-q34 


483 


19 


484 


4 


485 


12p12.3-13.2


486 


3 


487 


3 


488 


19 


489 


19 


490 


19 


491 


10p12


492 


17 


493 


10 


494 


12 


495 


18 


496 


13 


497 


10 


498 


16 


499 


22q12.2


500 


X 


501 


3 


502


15


503 


3 


504 


19 


505 


19 


506 


19 


507 


5


508 


7q21.2-q31.1


509 


2 


510 


2p13


511 


1p21-p13


512 


7 


513 


U 


514


16 


515 


9 


516 


18 


517 


10 


518 

J io 


5 


519


10 


520 


21q22.3 


591 


7 


59? 

jii- 


3p21.1-p14.2


523 


1q21


525 


1q25.1-31.1


526 


7q35


527 


9 


528 


1 


529 


5q32 


530 


19 


531 


2 


532 


18 


533 


1 


534 


22 


535 


13 


536 


X 


537 


4q21-q25


538


1q23-q25.1


539 


18 


540


22q12


541 


3p24 


542 


19p13.1


543 


2 


544 


14 


545 


6p2U-21.3 


546 


12 


547 


22q12-13.


548 


22q12-13.


549 


22q12-13.


550 


17 


551 


14 


552 


15 


553


15 


554 


1 


555 


1q23.2-24.3


556 


15q21.1 


557 


3 


558 


12q24.1 


559 


17 


560 


19 


561 


19 


562 


9


563 


5


564 


7 


565 


2 


1 566 


1 1 


567 


< 


568 




569 


i 
i 


570 


*\ 


571 




572 


1 1 all ^-n?3 1 


573 


1 In?? Vn?'* 1 I 


574 


17 


S7S 


4 


576 


1 1n 1 

i^q 


S77 


1^n14?-14^ 


578 


J 


S7Q 


1ft 1 


580 


1 c 
1 j 


581 


in 


582 


10 


583 


A 


584 


1 1 
l l 


585 


Y 
yv 


586 


17 


587 


7 


588 




589 


0 


590 


Q 


591 


?ft 


592 


17 


593 


17 


594 




595 


1 


596 


16 


597 


7 


598 


7 


599 


7 1 


600 


17 


601 


? 1 


602 


Q 


603 


7 


604 


1Q 


605 


1 6n?? 


606 


17


607 


?? 


608 


1 In 


609


11q


610 


9 


611 


20 


612 


2 


613 


6 


614 


6 


615 


15 


616 


22q13.32


617 


11 


618 


11 


619


15


620


15q21.3


621


15q21.3


622


14


623


14q


624


3


625


3


626


2p22-2p21


627


7


628


14


629


06


630


X


631


8


632


8


633


17q25.2-q25.3


634


16


635


13


636


13


637


1


638


19


639


19


640


9


641


19q13.2


642


19q13.2


643


1p36.1


644


19


645


1


646


Xq23


O^ti* 


2q35 

" 


650


17


651


7


652


1q21


653


4


654


14


655


2


656


1p36.2-p35


657


2


658


16


659


3


660


6


661


10


662


10


663


16


664


3


665


3


666 


17 


667 


12 


668 


16p11.2


669 


11 


670 


17 


671 


2 


672 


12p13.3


673 


12p13.3


674 


5 


675 


2


676


6p21.2-21.3


677 


19 


678


19


679 


19 


680 


3 


681 


2q14


682 


12 


683 


7 
TABLE 7



SEQ ID


Method


Predicted beginning nucleotide location of first amino acid residue of peptide sequence


Predicted ending nucleotide location of last amino acid residue of peptide sequence


Amino acid sequence (X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


1967 


A 


75 


509 


DPKAQLPEPLRVLWTAHLVAMAPGSRTSLLLAFALLC 
LPWLQEAGAVQTVPLSRLFDHAMLQAHRAHQLAIDTY 
QEFEETYIPKDQKYSFLHDSQTSFCFSDSIPTPSNME 
ETQQKSNLELLRISLLLIESWLEPVRILMSIVPN


1968 


A 


75 


509 


DPKAQLPEPLRVLWTAHLVAMAPGSRTSLLLAFALLC 
LPWLQEAGAVQTVPLSRLFDHAMLQAHRAHQLAIDTY
QEFEETYIPKDQKYSFLHDSQTSFCFSDSIPTPSNME
ETQQKSNLELLRISLLLIESWLEPVRILMSIVPN 


1969 


A 


75 


509 


DPKAQLPEPLRVLWTAHLVAMAPGSRTSLLLAFALLC 
LPWLQEAGAVQTVPLSRLFDHAMLQAHRAHQLAIDTY
QEFEETYIPKDQKYSFLHDSQTSFCFSDSIPTPSNME
ETQQKSNLELLRISLLLIESWLEPVRILMSIVPN


1970 


A 


75 


509 


DPKAQLPEPLRVLWTAHLVAMAPGSRTSLLLAFALLC
LPWLQEAGAVQTVPLSRLFDHAMLQAHRAHQLAIDTY
QEFEETYIPKDQKYSFLHDSQTSFCFSDSIPTPSNME
ETQQKSNLELLRISLLLIESWLEPVRILMSIVPN


1971 


A 


1764 


403 


KAAKKALCWLEPPQCAGLEGLGWVWSCSVSTGPRMQA
LVLLLCIGALLGHSSCQNPASPPEEGSPDPDSTGALV 
EEEDPFFKVPVNKLAAAVSNFGYDLYRVRSSMSPTTN 
VLLSPLSVATALSALSLGAEQRTESIIHRALYYDLIS 
SPDIHGTYKELLDTVTAPQKNLKSASRIVFEKKLRIK 
SSFVAPLEKSYGTRPRVLTGNPRLDLQEINNWVQAQM 
KGKLARSTKEIPDEISILLLG\VAHFKGQ\WETKFDS
RKTSLEDFYLDEERTVRVPMMSDPKAVLRYGLDSDLS 
CKIAQLPLTGSMSIIFFLPLKVTQNLTLIEESLTSEF 
IHDIDRELKTVQAVLTVPKLKLSYEGEVTKSLQEMKL 
QSLFDSPDFSKITGKPIKLTQVEHRAGFEWNEDGAGT 
TPSPGLQPAHLTFPLDYHLNQPFIFVLRDTDTGALLF
IGKILDPRGP 


1972 


A 


3 


147 


QPLNHYFICSSHNTYLVGDQLCGQSSVEGYIRCSGGR 
EGVQLMRGTM 


1973 


A 


2 


2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLF 

GRRWAI ASDDLVF PGFFELWRVLWWIGILTLYLMHR 

GKLDCAGGALLSSYLIVLMILLAWICTVSAIMCVSM 

RGTICNPGPRKSMSKLLYIRLALFFPEMVWASLGAAW 

VADGVQCDRTWNGI I ATVWSWI 1 1 AATWS 1 1 1 VF 

DPLGGKMAPYSSAGPSHLDSHDSSQLLNGLKTAATSV 

WETRI KLLCCC IGKDDHTRVAFS STAELFSTYFSDTD 

L VPS D I AAGLALLHQQQDN I RNNQE PAQ WCHAPGS S 

QEADLDAELKNCHHYMQFAAAAYGWPLYIYRNPLTGL 

CRIGGDCCRS KNPQTMT/MVGGDQLQL/ CTSAPILHT 

HRAAVQGLHPRQL PWTRFTELPFLVALDHRKE S VWA 

VRGTMSLQDVLTDLSAESEVLDVECEVQDRLAHKGIS 

QAARYVYQRLINDGILSQAFSIAPEYRLVIVGHSLGG 

GAAALLATMVRAAYPQVRCYAFSPPRGLWSKALQEYS 

QSFIVSLVLGKDVI PRLSVTNLEDLKRRILRWAHCN 

KPKYKILLHGLWYELFGGNPNNLPTELDGGDQEVLTQ 

PLLGEQSLLTRWS PAYSFS SDS PLDSS PKYPPLYPPG 

RIIHLQEEGASGRFGCCSAAHYSAKWSHEAEFSKILI 

GPKMLTDHMPDILMRALDSWSDRAACVSCPAQGVSS 


1974 


A 




DID 


VDVA 

E VHQGTE VRDS EVRRR PQARG PLMPAERAGRQRWLVP 
ALQPRRGGLRR* RGAVRQHGAHPHGLLLQDQKI PALP 
GRKQAGSLHAPGTEGEPDHGGDPVLDAGIQHHRQQRH 
PTADHLNPGEHRRGEAHVRAAV* PAAGAEGAAKERRA 
HQANTALQVHRR* LGS F AELRLLRKPGRTS VWPS PM 


1975 


A 


337 


440 


PLALCLAPAASLHELCAAKVSEVLHNRVHRTEEV


1976 


A 


1454 


1101 


AFYNANSCLNVFCFCFCFWRQSRCISQAGVQWCDLSS
LQPPLPRFKRFSYLSLPSSWDYRHAPPCPANFCIXLV
ETGLCHIGHACLELLTSGDPPALASQSAGITGMSHST
QPCIAVS


1977 


A 


2 


1454


DDFVGVLSATAQVCTMAARLVSRCGAVRAAPHSGPL/ 
AVLAQ WRR \ S TDTVYD VWS GGGL VGAAMAC ALG YD 
IHFHDKKILLLEAGPKKVLEKLSETYSNRVSSISPGS 
ATLLSSFGAWDHICNMRYRAFRRMQWroACSEALIMF 
DKDNLDDMGYI L\ ENDV\ IMHAFTKQLEAVSDRVTVL 
YRSKAIRYTWPCPFPMADSS PWVHITLGDGSTFQTKL 
LIGADGHNSGVRQAVGI QNVS WNYDQSAWATLHLSE 
ATENNVAWQRFLPSGPIALLPLSDTLSSLVWSTSHEH 
AAELVSMDEEKFVDAVNSAFWSDADHTDFIDTAGAML 
QYAVSLLKPTKVS ARQL PPS VARVDAKSRVLF PLGLG 
HAAEYVRPRVALIGDAAHRVHPLAGQGVNMGFGDI SS 
LAHHLSTAAFNGKDLGSMSHLTGYETERQRHNTALLA 
ATDLLKRLYSTS AS PLVLLRTWGLQATNAVS PLKEQI 
MAFASK 


1978 


A 


3692 


3395 


LKDSLLRFFFFEMESCSVTRLECSGVISAHRNLRLPG 
SSNS PTSASQVAGTTGMHPHTQLI FVFSAETGFPHAG 
QDGLDLL / NLVI S P P W P PKYLGLQ A 


1979 




cc 

DO 


265 


SALLGLPSSWDYRRPPPRPANFLYF* *RRGFTVLARM 
VS I C * PRDP PAS AS RS AGI SGVSRGRPPS 


1980 


A 


751 


176 


LPGADYGGGHLSLRLFHLLLTSAAWVPDESQVTLNSA 
ICVLSTVLIMEFPDLGKHCSEKTCKQLDFLPVKCDAC 
KQDFCKDHFPYAAHKCPFAFQKDVHVPVCPLCNTPIP 
VKKGQIPDVWGDHIDRDCDSHPGKKKEKIFTYRCSK 
EGCKKKEMLQMVCAQCHGNFCIQHRHPLDHSCRHGSR 

PTIKAG 


1981 


A 


250 


118 


DSLTRLPALCSLQLGRKVETITIIYDCEGLGIjKHLWK 
PAVEAYG 


1982 


A 




1 1 R7 


SIQEKCFDSSCGRNSLLSFSLSYKESHKTFIFYCWVY 
RLCI WI \TAI WQYESLKSRVQS YFDGI KADWLDS IRP 
QKEGDFRKEINKWWNNLSDGQRTVTGI IAANVLVFCL 
WRVPSLQRTMIRYFTSNPASKVLCSPMLLSTFSHFSL 
FHMAANMYVLWS FS S S I VNI LGQEQ FMAVYLSAGVI S 
NFVSYLGKVATGRYGPSLGASGAIMTVLAAVCTKI PE 
GRLAI IFLPMFTFTAGNALKAI IAMDTAGMILGWKFF 
DHAAHLGGALFGIWYVTYGHELIWKNREPLVKIWHEI 

RTNGPKKGGGSK 


1983 


A 


289 


392 


RAFAEAMRGYHGDRGSHPRPARFADQQHMDVGPA 


1984 


A 


98 


1474 


MAWASRLGLLLALLLPWGASTPGTVVRLNKAALSYV 
SEIGKAPLQRALQVTVPHFIjDWSGEALQPTRIRILNV 
HVPRLHLKFIAGFGVRLLAAANFTFKVFRAPEPLELT 
LPVELLADTRVTQSSIRTPWSISACSLFSGHANEFD 
GSNSTSHALLVLVQKHI KAVLSNKLCLS I SNLVQGVN 
VHLGTLIGLNPVGPESQIRYSMVS VPTVTSDYI SLEV 
NAVLFLLGKPI ILPTDATPFVLPRHVGTEGSMATVGL 
SQQLFDSALLLLQKAGALNLDITGQIiRSDDNLLNTSA 
LGRLI PEVARQFPEPMP WLKVRLGATPVAMLHTNNA 
TLRLQPFVEVLATASNSAFQSLFSLDVWNLRLQLSV 
S KVKLQGTTS VLGDVQLTVAS SNVGF I DTDQVRTLMG 
TVFEKPLLDHLNALLAMGIALPGWNLHYVAPEIFVY 
EGYWI S SGLFYQS * 


1985 


A 


541 


176 


GPHTSNRPRXRHCTXGPSTXXTXAGSGYS PAHGRAWG 
APCXSW*RSPGPRGGRESGTCRPAAAPAPAPAGGCRA 
GTGAWPPGSATSPRC* S PAAPRGAGPQPGSGGSHGGT 
ARMCACKLAAS 


1986


2S 
t\ 


C, J J \J 


1943 


AGRRLTQAGTLLGTAliAFGTRLLVSSDMKSWSTVLAV 
MGKAFS E AAFTTAYLFTS EL YPTVLRQTGMGLTALVG 
RLGGSIiAPIiAALLDGVWLSLPKLTYGGIALLAAGTAL 
LLPETRQAQLPETIQDVERKSAPTSLQEEEMPMKQVQ 
N 


1987 


A 


1 


555 


KKVGNYYTTPI YRFRMKCHLCVN YIEMQTDPANCDYV 
IVSGAQRKEERWDMADNEQVLTTGERHPLTCIiGAL/D 
PESALGPPKPSRALIVAEHEKKQKLETDAMFRLEHGE 
ADRSTLKKALPTLSHIQEAQSAWKDDFALNSMLRRRF 
RVRGAPARGQRGCMVDQGPGPALPPPHPSFEQATCTF 


1988 


A 


2867 


847 


GLPGIPGLPGFPGVAGPPGITGFPGFIGSRGDKGAPG 
RAGLYGE IGATGDFGDIGDT I NLPGRPGLKGERGTTG 
I PGLKGFFGEKGTEGD IGFPGI TGVTGVQGP PGLKGQ 
TGF PGLTGP PGSQGELGR IGLPGGKGDDGWPGAPGLP 
GFPGLRGIRGLHGLPGTKGFPGSPGSDIHGDPGFPGP 
PGERGDPGEANTLPGPVGVPGQKGDQGAPGERGPPGS 
PGLQGFPGI T PPSN I SGAPGDKGAPGI FGLKGYRGPP 
GPPGSAALPGSKGDTGNPGAPGTPGTKGWAGDSGPQG 
RPGVFGLPGEKGPRGEQGFMGNTGPTGAVGDRGPKGP 
KGDPGFPGAPGTVGAPGI AG I PQKI AVQPGTVGPQGR 
RGPPGAPGEMGPQGPPGEPGFRGAPGKAGPQGRGGVS 
AVPGFRGDEGPIGHQGPIGQEGAPGRPGSPGLPGMPG 
RS VS I GYLL VKHSQTDQE PMC P VGMNKLWSG YS LL YF 
EGQEKAHNQDLGLAGS CLARF STMPFLYCNPGDVC YY 
ASRNDKS YWLSTTAPLPMMPVAEDE I KPYI SRCS VCE 
APAIAI AVHSQDVSI PHCPAGWRSLWIGYSFLMHTAA 
GDEGGGQSLVSPGSCLEDFRATPFIECNGGRGTCHYY 
ANKYS FWLTTI PEQSFQGS PS ADTLKAGLIRTHI SRC 
QVCMKNL 


1989 


A 


1 


777 


LIYNEDMICWIESRESSNQLKCIQITKAGGLTDEWTI \ 
NILQSFHNVQQMAIDWLTRNLYFVDHVGDRIFVCNSN
GSVCVTLIDLELHNPKAIAVDPIAGKLFFTDYGNVAK 
VERCDMDGMNRTRIIDSKTEQPAALALDLYNKLVYWV
DLYLDYVGWDYQGKNRHAVIQGRQVRHLYGITVFED
YLYATNSDSYNIVRISRFNGTDIHSLIKIENAWGIRI 
YQKRTQPTVRSHACEVDPYGMPGGCSHICLLSSSYTK 


1990 


A 


1


777 


LIYNEDMICWIESRESSNQLKCIQITKAGGLTDEWTI 
NILQSFHNVQQMAIDWLTRNLYFVDHVGDRIFVCNSN
GSVCVTLIDLELHNPKAIAVDPIAGKLFFTDYGNVAK 
VERCDMDGMNRTRIIDSKTEQPAALALDLVNKLVYWV
DLYLDYVGWDYQGKNRHAVIQGRQVRHLYGITVFED
YLYATNSDSYNIVRISRFNGTDIHSLIKIENAWGIRI 
YQKRTQPTVRSHACEVDPYGMPGGCSHICLLSSSYTK 


1991


A 


1620 


1214 


LPFLSFFLSFFLFFLRWSFALIAQAGVQWCNFGSPQP 
PPPGFKRFSCLSLLSSWDYRHTPPCLANSVFLVDTGF 
LHVGQAGLELPTSGDPPTSASQSAGITSVSHCAQPVT 
AI SKEEREQAEGPDSQGTGSSAGQ 


1992 


A 


1 


660 


GFHPNTTHYRARAAARAGAGSFVGEVSAVDKDFGPNG
EVRYSFEMVQPDFELHAISGEITNTHQFDRESLMRRR
GTAVFSFTVIATDQGIPQPLKDQATVHVYMKDINDNA
PKFLKDFYQATISESAANLTQVLRVSASDVDEGNNGL
IHYSIIKGNEERQFAIDSTSGQVTLIGKLDYEATPAY
SLVIQAVDSGTIPLNSTCTLNIDILDENDNTPFFP


1993 


A 


1 


660 


GFHPNTTHYRARAAARAGAGSFVGEVSAVDKDFGPNG
EVRYSFEMVQPDFELHAISGEITNTHQFDRESLMRRR
GTAVFSFTVIATDQGIPQPLKDQATVHVYMKDINDNA
PKFLKDFYQATISESAANLTQVLRVSASDVDEGNNGL
IHYSIIKGNEERQFAIDSTSGQVTLIGKLDYEATPAY
SLVIQAVDSGTIPLNSTCTLNIDILDENDNTPFFP


1994 


A 


2 


271 


GSVALHVEKLPNEPNRLLILHGFLDENVHFFHTNFLV
SQLIRAGKPYQLQVALPPVSPQIYPNERHSIRCPESG 
EHYEVTLLHFLQEYL 


1995 


A 


289 


418 


LWTLYRHKQQVQHNHSNRLSCRPSQEDRATHTIMVLD 
KENTLS 


1996 


A 


3 


673 


RNFRVDD F VAELKLKQVRWT P AAP * S KETTQGLRRLH 
VNGRCEPKGLDPEMGRRSSDTEEESRSKRKKKHRRRS 
SSSSSSDSRTYSRKKGGRKSRSKSRSWSRDLQPRSHS 
YDRRRRHRSSSSSSYGSRRKRSRSRSRGRGKSYRVQR 
SRSKSRTRRSRSRPRLRSHSRSSERSSHRRTRSRSRD 
RERRKGRDKEKREKEKDKGKDKELHNIKRGESGNIKA 
GLE/HS ATS * TGQSQTTAGS * SCCKS * * S I ESQRKK* 
GRS KE / QERRKTKPPW * NK* KE * KFGGRRRRPDLKKR 
LRDCGACTSTGGVS PKVWTQKWDVGHQILKKKAEARE 
KRNTVDGPPRAVLQIVEHTAERKEEGNQDQSQDLGPE 
I FSLVHI LM I EDAGI DQAVALLMAPEGNEVEWQGVE 
GNPIEFRGLGQKAEQEGPGQDLVSVLIWAVKGPVTE 
ERWGLGI ENDVRAE I KRKEKRRRI KGRTRNYITSNV 
GNLETSKLD 


1997 


A 


279 


762 


VGNFQRQLAEAKEDNCKVTIMLENVLASHSKMQGALE 
KVQIELGRRDSEIAGLKKERDLNQQRVQKLEAEVDQW 
QARMLVMEDQHNSEIESLQKALGVAREDNRKLAMSLE
QALQTNNHLQTKLDHIQEQLESKELERQNLETFKDRM 
TEESKVEAELHAE 


1998 


A 


3 


1434 


PPNMDNSMGTEEITVLKGSSTSMACITDGTPAPSMAW 
LRDGQPLGLDAHLTVSTHGMVLQLLKAETEDSGKYTC 
IASNEAGEVSKHFILKVLVPPSFQKLWEIGNMLDTGR 
NGEAKD VI I NN P I SL YCETNAAPP PTLTWYKDGHPLT 
SSDKVLILPGGRVLQI PRAKVEDAGRYTCVAVNEAGE 
DSLQYDVRVLVPPITKGANSDLPEEVTVLVNKSALIE 
CLS SGS PAPRNS WQKDGQPLLEDDHHKFLSNGRI LQI 
LNTQI TD I GRYVCVAENTAGS AKKYFNLNVHVP PS VI 
GPKSENLTVWNNFI SLTCEVSGFPPPDLSWLKNEQP 
I PLNTNTLI APGGRTLQ 1 1 RAKVS DGGE YTC I AINQ A 
GESKKKFSLTVYVPPSIKDHDSESLSVVNVREGTSVS 
LECESNAVPPPVITWYKNGRMITESTHVEILADGQML 
HI KKAEVSDTGQYVCRAINVAGRDDKNFHLNVY 


1999 


A 


2 


1333 


RSGEGFHVNSS *TWVSRS * EMDETPGSEVPGDKAAEE 
QGDDQDS E KS KPAGSDGERRGVKRQRDEKDEHGRAY Y 
EFREEAYHS RS KS PLPPEEEAKDEEEDQTLVNLDTYT 
SDLHFQVSKDRYGGQPLFSEKFPTLWSGARSTYGVTK 
GKVCFEAKVTQNLPMKEGCTEVSLLRVGWSVDFSRPQ 
LGEDEFSYGFDGRGLKAENGQFEEFGQTFGENDVIGC 
FANFETEEVELSFSKNGEDLGVAFWISKDSIiADRALL 
PHVLCKNCWELNFGQKEEPFFPPPEEFVFIHAVPVE 
ERVRTAVPPKTIEECEVILMVGLPGSGKTQWALKYAK 
ENPEKRYNVLGAETVLNQMRMKGLEEPEMDPKSRDLL 
VQQASQCLSKLVQIASRTKRNFILDQCNVYNSGQRRK 
LLLFKTFSRKWVWPNEDDWKKRLELRKEVEGRVFP 


2000


A


1


1060 


1 1 FLFF * PYLQS VI FLFVI RGLEMKYGNE IMNKDPVF 
RI S PRSRETHPNPEEPEEEDEDVQAERVQAANALTAP 
NLEEEPVITASCLHKEYYETKKSCFSTRKKKIAIRNV 
S FCVKKGEVLGLLGHNGAGKS TS I KMI TGCTVPTAGV 
WLQGNRASVRQQRDNSLK/ FLGYCPQENSLWPKLTM 
KEHLEL YAAVKGLGKDAALS IS* LVEALKLQEQLKAP 
VKTLSEGI KRKLCFVLS I LGNPS WLIiDELFTGMD PE 
GQQQMWQILQATIKNQERGALLTTHYMSEAKSLCDRV 
AIMVSGTLRC I GS I QQL / KKFGKDYLLE I KMKE PTQV 
EALHTE I LKLF PQ AAWQERYS S L 


2001 


A 


1 


2543 


TISSSPKWRLSGWRAPCCWGFEWAGGPGDPFPAAEA 
IiEDESGTLLRSGGGAGEQWQQGLRWRPRSGMCESYSR 
SLLRVS VAQI CQALGWDS VQLS ACHLLTDVLQRYLQQ 
LGRGCHRYS ELYGRTDP I LDDVGEAFQLMGVS LHELE 
DYI HN I E PVTF PHQ I PSFPVSKNNVLQFPQPGSKDAE 
ERKEYIPDYLPPIVSSQEEEEEEQVPTDGGTSAEAMQ 
VPIiEEDDELEEEEIINDENFLGKRPLDSPEAEELPAM 
KRPRLLSTKGDTLDWLIjEARE pls S INTQKI PPMLS 
PVHVQDSTDLAPPSPEPPMLAPVAKSQMPTAKPLETK 
SFTPKTKTKTSS PGQKTKSPKTAQS PAMVGS PI RS PK 
TVSKEKKS PGRS KSPKSPKSPKVTTHI PQTPVRPETP 
NRTPS ATLS E KI S KETI Q VKQ I QT PPDAGKLNS ENQ P 
KKAWADKTIEASIDAVIARACAEREPDPFEFSSGSE 
S EGD I FTS PKRI SGPECTTPKASTSANS FTKSGSTPL 
PLSGGTSSSDNSWTMDASIDEWRKAKLGTPSNMPPN 
FPYISSPSVSPPTPEPLHKVYEEKTKLPSSVEVKKKL 
KKELKTKMKICKEKQRDREREKDKNKDKSKEICDKVKEK 
EKDKETGRETKYPWKEFLKEEEADPYKFKIKEFEDVD 
PKVKLKIXjLWKEKEKHKDKKKDREKGKKDKDKREKE 
KVKDKGREDKMKAPAPPLVLPPKELALPLFSPATASR 
VPAMLPSLLPVLPEKLFEEKEKPKEKEKKKDKKEKKK 
KKEKEKEKKEKEREKEKREREKREKEKEKHKHEKIKV 
E PVALAPS P VI PRIiTLRVGAGPDKI RRRRAGAH 


2002 


A 


2 


1736 


QNENS VDKWGKPLVI DKLKEMAKVEGLWNLFLPAVSG 
LSHVDYALIAEETGKCFFAPDVFNCQAPDTGNMEVLH 
L YGS EEQKKQWLE PLLQGNI TS C FCMTE PDVAS SDAT 
NI ECS I QRDEDS YVINGKKWWS SGAGNPKCKI AI VLG 
RTQNTSLSR*LNNSD*ETCVGMSQSSSYLGNLLKIHC 
LDSQI IM* DMRVNVI YLYFTS IF* QVFLENI IGS I AE 
HSSLWNFQY*KVLLNYQSCLD*IIRQIFSDLCNEVIR 
CLDQRQ *S*NV*LYI* VPS YHC * AVRS FNQTTHIiFSN 
HCFCSRSQPASDYVGVRLLHSSHSSHHCLHDYMKTSK 
RQLGFCLLSVLFFFIiANFF*YNFSFD*\HKQHSMILV 
PMNT PGVKI I RPLS VFGYTDNFHGGHFE I HFNQVRVP 
ATNLI LGEGRGFEI SQGRLGPGRIHHCMRTVGLAERA 
LQIMCERATQRIAFKKKLYAHEWAHWIAESRIAIEK 
I RLLTLKAAHSMDTLGS AGAKKEI AMI KVAAPRAVSK 
I VDWAI QVCGGAGVSQD YPIiANMYAI TRVLRLADGPD 
EVHLS AI ATMELRDQAKRLTAKI 


2003 


A 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGF 
GNAAS ARHHGLLAS ARQ PGVCHYGTKLACC YGWRRNS 
KGVCEATCEPGCKFGECVGPNKCRCFPGYTGKTCSQD 
VNECGMKPRPCQHRCVNTHGSYKCFCLSGHMLMPDAT 
CVNSRTCAMINCQYSCEDTEEGPQCLCPSSGLRLAPN 
GRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCHIG 
FEIiQYISGRYDCIDINECTMDSHTCSHHANCFNTQGS 
FKCKCKQGYKGNGLRCSAIPENSVKEVLRAPGTIKDR 
IKKLLAHKNSMKKKAKIKNVTPEPTRTPTPKVNLQPF 
NYEEIVSRGGNSHGG\KKGNEEKMKEGLEDEKREEKA 
LKD*HRRERPFRG\DVFFPKVNEAGEFGLIIj\VQRKA 
LTSKLEHKADLNISVDCSFNHG\ICDW\KQDR\EDDF 
DW\NPADR\DNAI \GFY\MAVPGLWQGHK\KDIGRLK 
LLLPDLQPQSNFCLLFDYRLAGDKVGKLRVFVKNSNN 
ALAWEKTTSEDEKWKTGKIQIiYQGTDATKS 1 1 FEAER 
GKGKTGEIAVDGVLLVSGLCPDSLLSVDD 


2004 


A 


2 


469 


KGTKNGQFNYPWDVAVNSEGKILVSDTRNHRIQLFGP 
DGVFLNKYGFEGALWKHFDSPRGVAFNHEGHLWTDF
NNHRLLVIHPDCQSARFLGSEGTGNGQFLRPQGVAVD 
QEGRIIVADSRNHRVQMFESNGSFLCKFGAQGSGFGQ 
MDRPSGIA 


2005 


A 


4135 


639 


QCGPEAASAGSCSAETPSPPPRAPGRGPIMFSRKKRE 
LMKTPSISKKNRAGSPSPQPSGELPRKDGADAVFPGP 
SLEPPAGSSGVKATGTLKRPTSLSRHASAAGFPLSGA 
ASWTLGRSHRS PLTAAS PGELPTEGAGPDWEDI SHL 
LADVARFAEGLEKLKECVLHDDLLEARRPRAHECLGE 
ALRVMHQI I SKYPLLNTVETLTAAGTLI AKVKAFHYE 
SNNDLEKQEFEICALETIAVAFSSTVSEFLMGEVDSST 
LLAVPPGDSSQSMESLYGPGSEGTPPSLEDCDAGCLP 
AEEVDVLLQRCEGGVDAALLYAKNMAKYMKDLI s yle 
KRTTLEME P AKGLQKI AHNCRQS VMQE PHM PLLS I YS 
LALEQDLEFGHSIWQAVGTLQTQTFMQPiiTLRRLEHE 
KRRKE I KEAWHRAQRKLQEAESNLRKAKQGYVQRCED 
HDKARFLVAKAEEEQAGSAPGAGSTATKTLDKRRRLE 
EEAKNKAEEAMATYRTCVADAKTQKQELEDTKVTALR 
QIQEVIRQSDQTIKSATISYYQMMHMQTAPLPVHFQM 
LCESSKLYDPGQQYASHVRQLQRDQEPDVHYDFEPHV 
SANAWS PVMRARKS S FN VSDVARPEAAGS PPEEGGCT 
EGTPAKDHRAGRGHQVHKSWPLS I SDSDSGLDPGPGA 
GDFKKFERTSSSGTMSSTEELVDPDGGAGASAFEQAD 
LNGMTPELPVAVPSGPFRHEGLSKAARTHRLR\KLRT 
PAKCRECNSYVYFQGAECEECCLACHKKCIiETLAIQC 
GHKKLQGRLQLFGQDFSHAARSAPDGVPFIVKKCVCE 
T PR R ALRTKGI YRVNGVKTRVEKLCQAFENGKELVEL 
SQAS PHDI SNVLKLYLRQLPE PLI S FRL YHELVGLAK 
DSLKAEAEAKAASRGRQDGSESEAVAVALAGRLRELL 
RDLPPENRASLQYLLRHLRRIVEVEQDNKMTPGNLGI 
VFGPTLLRPRPTEATVSLSSLVDYPHQARVIETLIVH 
YGLVFEEEPEETPGGQDESSNQRAEWVQVPYLEAGE 
A WY PLOEAAADGCRE SRWSNDS DSDLEEAS ELLS S 
SEASALGHLSFLEQQQSEASLEVASGSHSGSEEQLEA 
TAREDGDGDEDGPAQQLSGFNTNQSNNVLQAPLPPMR 
LRGGRMTLGSCRERQPEFV 


2006 


A 


•a 
J 


GO ft 
o Z 0 


S VGALDT F I AAVYE HA VI L PNRAET P VS KE EALLLMN 
KNIDVLEKAVKLAAKQGAHI I VTPEDGI YGWI FTRES 
I YPYLEDI PDPGVNWI PCRDPWRNH*NI VSLRKCLLN 
\ RFGNT PVQQRLS CLAKDNS I YWANIGDKKPCNASD 
SQC PPDGRYQYNTD WFDSQGKLLARYHKYNLFAPE I 
QFDFPKDSELVTFDTPFGKIGIIT 


2007 


A 


1375 


1453 


RTFTS*CSVSCGRGVQQRHVGCQIGTHKIARETECNP 
YTRPESERDCQGPRCPLYTWRAEEWQEVSRATKGYLP 
GI SRVRPLLSSHLFPI KPEKS PST VTMLALSQKVHCQ 
TRAFAPTRVGELLVFKQFL 


2008 


A 


2679 


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQ 
QRAVE YLTLS S VASTDVLATVLEEM PPF PERES S ILA 
m ,KR KKfSPaAGS ALDDGRRD PS SNDINGGME PTPSTV 
STPS PSADLLGLRAAPPPAAP PASAGAGNLLVDVFDG 
PAAQPSIiGPTPEEAFLSPGPEDIGPPI PEADELLNKF 
VC KNNGVLFENQLLQ I GVKS E FRQNLGRMYL FYGNKT 
SVQFQNFS PT WHPGDLQTQLAVQTKRVAAQVDGGAQ 
VQQVLNIECLRDFLTPPLLSVRFRYGGAPQALTLKLP 
VTINKFFQPTEMAAQDFFQRWKQLSLPQQEAQKI FKA 
NHPMDAEVTKAKLLGFGSALLDNVDPNPENFVGAGI I ! 
QTKALQVGCLLRLE PNAQ AQM YRLTLRTS KE PVSRHL 
CELLAQQF 


2009 


A 


153 


1994 


MGALRPTLLPPSLPLLLLLMLGMGCWAREVLVPEGPL 
YRVAGTAVSISCNVTGYEGPAQQNFEWFLYRPEAPDT
ALGIVSTKDTQFSYAVFKSRWAGEVQVQRLQGDAW 
LKIARLQAQDAGIYECHTPSTDTRYLGSYSGKVELRV 
LPDVLQVSAAPPGPRGRQAPTSPPRMTVHEGQELALG
CLARTSTQKHTHLAVSFGRSVPEAPVGRSTLQEWGI
RSDLAVEAGAPYAERLAAGELRLGKEGTDRYRMWGG
AQAGDAGTYHCTAAEWIQDPDGSWAQIAEKRAVLAHV
DVQTLSSQLAVTVGPGERRIGPGEPLELLCNVSGALP
PAGRHAAYSVGWEMAPAGAPGPGRLVAQLDTEGVGSL
GPGYEGRHIAMEKVASRTYRLRLEAARPGDAGTYRCL
AKAYVRGSGTRLREAASARSRPLPVHVREEGWLEAV
AWLAGGTVYRGETASLLCNISVRGGPPGLRLAASWWV
ERPEDGELSSVPAQLVGGVGQDGVAELGVRPGGGPVS
VELVGPRSHRLRLHSLGPEDEGVYHCAPSAWVQHADY
SWYQAGSARSGPVTVYPYMHALDTLFVPLLVGTGVAL
VTGATVLGTITCCFMKRLRKR* 


2010 


A 


153 


1994 


MGALRPTLLPPSLPLLLLLMLGMGCWAREVLVPEGPL 
YRVAGTAVSISCNVTGYEGPAQQNFEWFLYRPEAPDT 
ALGIVSTKDTQFSYAVFKSRWAGEVQVQRLQGDAW
LKIARLQAQDAGIYECHTPSTDTRYLGSYSGKVELRV
LPDVLQVSAAPPGPRGRQAPTSPPRMTVHEGQELALG
CLARTSTQKHTHLAVSFGRSVPEAPVGRSTLQEWGI
RSDLAVEAGAPYAERLAAGELRLGKEGTDRYRMWGG
AQAGDAGTYHCTAAEWIQDPDGSWAQIAEKRAVLAHV
DVQTLSSQLAVTVGPGERRIGPGEPLELLCNVSGALP
PAGRHAAYSVGWEMAPAGAPGPGRLVAQLDTEGVGSL
GPGYEGRHIAMEKVASRTYRLRLEAARPGDAGTYRCL
AKAYVRGSGTRLREAASARSRPLPVHVREEGWLEAV
AWLAGGTVYRGETASLLCNISVRGGPPGLRLAASWWV
ERPEDGELSSVPAQLVGGVGQDGVAELGVRPGGGPVS
VELVGPRSHRLRLHSLGPEDEGVYHCAPSAWVQHADY
SWYQAGSARSGPVTVYPYMHALDTLFVPLLVGTGVAL
VTGATVLGTITCCFMKRLRKR* 


2011 


A 


153 


1994 


MGALRPTLLPPSLPLLLLLMLGMGCWAREVLVPEGPL 
YRVAGTAVSISCNVTGYEGPAQQNFEWFLYRPEAPDT
ALGIVSTKDTQFSYAVFKSRWAGEVQVQRLQGDAW
LKIARLQAQDAGIYECHTPSTDTRYLGSYSGKVELRV
LPDVLQVSAAPPGPRGRQAPTSPPRMTVHEGQELALG
CLARTSTQKHTHLAVSFGRSVPEAPVGRSTLQEWGI
RSDLAVEAGAPYAERLAAGELRLGKEGTDRYRMWGG
AQAGDAGTYHCTAAEWIQDPDGSWAQIAEKRAVLAHV
DVQTLSSQLAVTVGPGERRIGPGEPLELLCNVSGALP
PAGRHAAYSVGWEMAPAGAPGPGRLVAQLDTEGVGSL
GPGYEGRHIAMEKVASRTYRLRLEAARPGDAGTYRCL
AKAYVRGSGTRLREAASARSRPLPVHVREEGWLEAV
AWLAGGTVYRGETASLLCNISVRGGPPGLRLAASWWV
ERPEDGELSSVPAQLVGGVGQDGVAELGVRPGGGPVS 
VELVGPRSHRLRLHSLGPEDEGVYHCAPSAWVQHADY 
SWYQAGSARSGPVTVYPYMHALDTLFVPLLVGTGVAL 
VTGATVLGTITCCFMKRLRKR* 


2012 


A 


153 


1994 


MGALRPTLLPPSLPLLLLLMLGMGCWAREVLVPEGPL 
YRVAGTAVSISCNVTGYEGPAQQNFEWFLYRPEAPDT
ALGIVSTKDTQFSYAVFKSRWAGEVQVQRLQGDAW
LKIARLQAQDAGIYECHTPSTDTRYLGSYSGKVELRV
LPDVLQVSAAPPGPRGRQAPTSPPRMTVHEGQELALG
CLARTSTQKHTHLAVSFGRSVPEAPVGRSTLQEWGI
RSDLAVEAGAPYAERLAAGELRLGKEGTDRYRMWGG
AQAGDAGTYHCTAAEWIQDPDGSWAQIAEKRAVLAHV
DVQTLSSQLAVTVGPGERRIGPGEPLELLCNVSGALP
PAGRHAAYSVGWEMAPAGAPGPGRLVAQLDTEGVGSL
GPGYEGRHIAMEKVASRTYRLRLEAARPGDAGTYRCL
AKAYVRGSGTRLREAASARSRPLPVHVREEGWLEAV
AWLAGGTVYRGETASLLCNISVRGGPPGLRLAASWWV
ERPEDGELSSVPAQLVGGVGQDGVAELGVRPGGGPVS
VELVGPRSHRLRLHSLGPEDEGVYHCAPSAWVQHADY
SWYQAGSARSGPVTVYPYMHALDTLFVPLLVGTGVAL
VTGATVLGTITCCFMKRLRKR*


2013 


A 


1273 


480 


YLRLWLRHFDPRHPHGVPLPTEPSTPKSPSAGPSPHL 
LHPGTPGHPSASPPSRPPSSSTPKRPRTAGRNPKRRQ 
SSPGRPT/NPGLRKKMGPPSEG\SGGGNTPQGPASGP 
ASLLPNPC* LCRGKPLGVLRGGGRRGASVPESWPHI P 
APN AG * GHAQRD PGG AGQ PKD * GGRGAPGQQATE ADS 
GPAA\GMRGPHI IQLDTPLSASRGMRNARGTFGM/ PS 
LPRGDLSPSSAGHPPASVTLPQGPHFPKGTLAPGTIiP 

PALFGDQEL 


2014 


A 


853 


1553 


KKKETVSVSSREVRETSKALERPKLQE*PRGPALQSR 
ATS PRNTYQRPAGWPQAE P PQ * GNRLF PAGVRGRAPG 
PHPRA*WSQPPAEDPTGRAETQLCPPAALARAQPRRQ 
LCGPALPGPRRP/ PTRTPT * SGRGFS KWLAPEITQGP 
APN\ PFGFSDVLFCVFFKPFSLFR* *KNL*KTLLTNQ 
PEPQEPKGCGGVWRPHYVSGLLPTLKPCSLKREGPRP 
ALPPS / SPS PPPLCPSLRSPPASL/ PPVI LAFRVPWR 
FP* PPVKIQRLSPFFFNFDN* / PSVSFSKFYFSNHPG 
QPPALI PSRPGLSGPPFHTLRFETAVFPTFAAGMAVS 
CPCLPIWPIPQPWGPGSLPQPPPLLMP*KLGPRPCWP 
EPQMPSSGSLT/SGPNSSGLGIGPPYPGSPPWGQ*KG 
KAFILANRPHHPLLPGPPCRDGLSLP/RPLLSVCGSR 
TLCPS PGASAVTRLLKMNS * I LPAHPRPDPWS WPPSS 
PVPETSTP*R*TLGPPTSRTCRPEV\PWALPPANWAT 
SFPPLTLG/ VPHPLQGDYS PDPTPVSPHGPLLN 


2015 


A 


527 


871 


VWS PDRPSS SDPRGQRRRPTGRVAADPGAAPPAAAAA 
PPPSSA*TAPGSCRRWRTSSRRPTPGSNPRPTPPRPR 
SRATSP/TPDSAQRLPPPPPPAGPG\PPGPEAPPVSL 

GQPFCR 


2016 


A 


17 


941 


PLDRAVEFAVGSGRPRRISCLSCPGGGGAASGLQRAA 
GGTGLSWVPAGLRVCCSQRSERPEKEEQPVQNPRRKG 
KGGEISTWKNSSMKMKECLRIKER*TMKNSHRTRESQ 
K* LVFWKTRS * KTRETQKTRARELRNR* RI KKSQRVR 
ERQKEKESQRGRESQRCREDQRQRESQREGEGQRVKE 
SQTWVRE PES EGEPE S ETRAAGKRPAEDDI PRKAKRK 
TNKGLAQ YLKQ YKE AI HDMNFSNEDMI RE FDNMARVE 
DICRRKSKQKLGAFLWMQRNLQDPFYPRGPREFRGGCR 
APRRDTEDIPYV 


2017 


A 


o o c 

335 


±Z\J 


MPT.T.T.PPT.MTrnPTKVPPTLiLLiHIFCLSTCLFLGLHIC 
AS FHARALLETALI LLRMKI AGFQVI LF PQD F VL * 


2018 


A 


3 


800 


FVLDPYSGVIKSNVSFDREQQSSYTFDVKATDGGQPP 
RS STAKVTINVMDVNDNS PWI S PPSNTSFKLVPLS A 
I PGS WAEVFAVDVDTGMNAE LKYTI VSGNNKGLFRI 
D PVTGN I TLEEKPAPTDVGLHRLWNI SDLGYPKSLH 
TLVLVFLYVNDTAGNAS YI YDLI RRTMETPLDRNI GD 
SSQPYQNEDYLT IMIAI IAGAMWI WIFVTVLVRCR 
HASRFKAAQRSKQGAEWMS PNQE1\TCQNKKKKRKKRKS 
PKSSLLN 


2019 


A 


1 


1331 


GWNGSWNDNLVDTSPLKRDPLQDI CRRYMEDLKKICF 
YRELNSKTTLKFVHTSFHGVGHDYVQLAFKVFGFKPP 
T P XI P PHTTn PD PDP c TVFCC PNP E E GE S VLEL S LRLAE K 
ENARVVLATDPDADRLAAAELQENGCWKVFTGNELAA 
LFGWWMFDCWK^KSRNADVKNVYMLATTVSSKILKA 
IALKEGFHFEETLPGFKWIGSRI IDLLENGKEVLFAF 
EES IGFLCGTSVLDKDGVSAAVVVAEMAS YLETMNIT 
LKQQLVKVYEKYGYHI S KTSYFLCYE PPTIKSI FERL 
RNFDSPKEYPKFCGTFAILHVRDVTTGYDSSQPNKKS 
VLPVSKNSQMITFTFQNGCVATLRTSGTEPKIKYYAE 
MCASPDQSDTALLEEELKKLIDALIENFLQPSKNGTG 
SGRSCLGVPPNTVMTLCGAYGNRATRRNCHTLEPCG 


2020 


A 


1 


2337 


TRFRGLRPAVAPWTALLALGLPGWVLAVSATAAAWP 
EQHASVAGQHPLDWLLTDRGPFHRAQEYADFMERYRQ 
GFTTRYRIYREFARWKVNNLALERKDFFSLPLPLAPE 
FIRNIRLLGRRPNLQQVTENLI KKYGTHFLLSATLGG 
EE SLTI FVD KQKLGRKTETTGGAS 1 1 GGSGNSTAVSL 
ETLHQLAASYFIDRESTLRRIjHHIQIATGAIKVTETR 
TGPLGCSNYDNLDSVSSVLVQSPENKVQLLGLQVLLP 
EYLRERFVAAALSYITCSSEGELVCKENDCWCKCSPT 
FPECNCPDADIQAMEDSLLQIQDSWATHNRQFEESEE 
FQALLKRLPDDRFLNSTAI SQFWAMDTSLQHRYQQLG 
AGLKVLFKKTHRILRRLFNLCKRCHRQPRFRLPKERS 
LSYWWNRIQSLIiYCGESTFPGTFLEQSHSCTCPYDQS 
QroHPT PrAT^EGPACAHCAPDNSTRCGSCNPGYVLA 
QGLCRPEVAESLENFLGLETDLQDLELKYLLQKQDSR 
IEVHSIFISISTOMRLGSWFDPSWRIO^LTLKSNKYKP 
GLVHVMIxALSLQICLTKNSTIjEPVMAIYVNPFGGSHS 
t? a wttm P\7Ml?n Q T7 PTO/JE! P TMVD A A AOPONWT I TJjGNRW 

Hi O W 17 1*1 XT V IN HiVJtD 17 iriJMEaXv ±Vi V UT\nn\^\^>^Vt n J. J. ijju»i\ii 

K^FFETVHVYLRSRIKSLDDSSNETIYYEPLEMTDPS 
KNLGYMKINTL\QVFGYSLPFDPD\AIRDLILQLDYP 
YTQGSQDS ALLQLI ELRDRVNQLS PPGKVRLDLFSCL 
LRHRLKIjAIWEVGRIQSSLRAFNSKLPNPVEYETGK^ 

cs 


2021 


A 


161 


547 


PAGIGRSTAKTPGTPGSLEMENLKSGVYPLKEASGCP 
GADRNLLVYSFYEKGPLTFRDVAIEFSLEEWQCLDTA 
QQDLYRKVMLENYRNLVFLAGIAVSKPDLITCLEQGK
EPWNMKRHAMVDQPPGR


2022 


A 


161 


547 


PAGIGRSTAKTPGTPGSLEMENLKSGVYPLKEASGCP
GADRNLLVYSFYEKGPLTFRDVAIEFSLEEWQCLDTA
QQDLYRKVMLENYRNLVFLAGIAVSKPDLITCLEQGK 
EPWNMKRHAMVDQPPGR 


2023 


A 


3 


452 


AVPGPGFGLS PTMVTLAELLVLLAALLATVSGY\ FVS 
IDAHAEECFFERVTSGTKMGLIFEAEDGGFLDIDWI 
TLPDR / RKI KPRIjIjKKKGQ * TYRS FMDVTFKLC YNLR 
MSWMNPNIRNHNHWLLLTSIKFLITQFRSSLSYLSSC 

IOSE 


2024 


A 


31 


1312 


ITTVMAGKRSGWSRAALLQLLLGVNLGVMPPTRARSL 
RFVTLIjYRHGDRSPVKTYPKDPYQEEEWPQGFGQLTK 
EGMLQHWELGQALRQRYHGFLNTSYHRQEVYVRSTDF 
DRTLMSAEANLAGLFPPNGMQRFNPNI SWQP I PVHTV 
PITEDRLLKFPLGPCPRYEQLQNETRQTPEYQNESSR 
NAQFLDMVANETGLTDLTLETVWNVYDTLFCEQTHGL 
RLPPWAS PQTMQRLSRLKDFS FRFLFGI YQQAEKARL 
QGGVLLAQI RKNLTLMATTSQLPKLLVYS AHDTTLVA 
LQMALD VYNGEQAP YAS CHI FELYQEDSGNFS VEMYF 
RNESDKAPWPLSLPGCPHRCPLQDFLRLTEPWPKDW 
QQECQLASGPADTEVI VALAVCGS I LFLLI VLLLTVL 
FRMQ AQP PG YRHVADGEDHA 


2025 


A 


2 


317 


FVDSPRFRATIDEVETDWEIEAKLDKLVKLCSGMVE
AGKAYVSTSRLFVSGVRDLSQQCQGDTVISECLQRFA
DSLQEWNYHMILFDQAQRSVRQQLQSFVKE


2026 


A 


1788 


3 


RTRGRFPKRTP / L FQI S S AVQKEQPLPTAE I TRLAVW 
AAVQAVERKLEAQAMRLLTLEGRTGTNEKKIADCEKT 
AVEFANHLE S KWWLGTLLQE YGIjLQRRIjENMENIjLK 
NRNFWILRLPPGSNGEVPKVPVTFDDVAVHFSEQEWG 
NLSEWQKELYKNVMRGNYESLVSMDYAI SKPDLMSQM 
ERGERPTMQEQEDSEEGETPTDPSAAHDGIVIKIEVQ 
TNDEGSESLETPEPLMGQVEEHGFQDSELGDPCGEQP 
DLDMQEPENTLEEST / DRLQRVQRTEADAGAAEELHG 
/VGS/WIKTEEQDEEEEEEEEDELPQHLQSLGQLSGR 
YEASMYQTPLPGEMS PEGEES PPPLQLGNPAVKRIjAP 
SVHGER / PPERE PRGLEPAAAE PARRAALHMHGVRQE 
LPP / GRSTS S STS ATTSRRGPTS APNARSASGTSNSS 
RCTSASTACAEAASHPN/ cgptfnpkhalkprpksps 
SGSGGGGPKPYKCPECDSSFSHKSSLTKHQITHTGER 
PYTCPECKKSFRLHI SLVIHQRVHAGKHEVS FI CSLC 
GKS FS RPSHLIjRHQRTHTGER PFKC PE CEKS FS EKS K 
LTNHCRVHS RERP 


2027 


A 


2193 


442 


ELNCNIRAPPKQMFWCFRPRSKERAVWAWERRLMW 
GDAPESIQFVLDEDSYLVPELLX5VRIFSRSTHEFLHE 
VPAASEEIFKIASMAPGALLLEAQKEYEKESQKADEY 
LREIQELGQLTQAVQQCIEAAGHEHQPDMQKSLLRAA 
S FGKC FLDRF P PDS FVHMCQDLRVLNAVRDYHI GI PL 
TYSQYKQLTIQVLLDRLVLRRLYPLAIQICEYLRLPE 
VQGVSRILAHWACYKVQQKDVSDEDVARAINQKLGDT 
PGVSYSDIAARAYGCGRTELAIKLLEYEPRSGEQVPL 
LLKMKRSICLALSKAIESGDTDIiVFTVLLHLKNELNRG 
DFFMTLRNQPMALSLYRQFCKHQELETLKDLYNQDDN 
HQELGS FHI RASYAAEERI EGRVAALQTAADAF YKAK 
NEFAAKATEDQMRLLRLQRRLEDELGGQFLDLSLHDT 
VTTLI LGGHNKRAEQIiARDFRI PDKRLWWLKLTALAD 
LEDWEELEKFSKSKKSPIGYLPFVEICMKQHNKYEAK 
KYAS RVGPEQKVKALLLVGDVAQAADVAI EHRNEAEL 
SLVXiS HCTGATDGATAD KI QRARAQAQKK 


2028 


A 


110 


277 


MLLALPLAAPSCPMLCTCYSSPPTVSCQANNFSSVPL 
SLPPSTQRLFLQNNLIRTL 


2029 


A 


1 


359 


ISGESIYWSQKPTPSSNASPWSEPAAVDVELTAYALL
AQLTKPSLTQKEIAKATSIVAWLAKQRNAYGGFSSTQ
DTWALQALAKYATTAYVPSEEINLWKSTENFQRTF
NIQAVNRM 


2030 


A 


16 


255 


ARPSCPCSWSFSCCGVSPGA/LVTEAAIFYETQPSLW 
AESESLLKPLAKLMTYFKNSTYLIRLFMIYRCKPVKS 

KKKKRN 


2031 


A 


2 


414 


GKTHTATWELNPWVEYEFRWASNKIGGGEPSLPSE
KVRTEEAVPEVPPSEVNGGGGSRSELVITWDPVPEEL
QNGEGFGYWAFRPLGVTTWIQTVVTSPDTPRYVFRN
ESIVPYSPYEVKVGVYNNKGEGPFSP


2032 


A 


3 


438 


SNLHHLILNNNQLTLISSTAFDDVFALEELDLSYNNL
ETIPWDAVEKMVSLHTLSLDHNMIDNIPKGTFSHLHK
MTRLDVTSNKLQKLPPDPLFQRAQVLATSGIISPSTF
ALSFGGNPLHCNCELLWLRRLSREDDLETCASPP 


2033 


A 


3 


438 


SNLHHLILNNNQLTLISSTAFDDVFALEELDLSYNNL
ETIPWDAVEKMVSLHTLSLDHNMIDNIPKGTFSHLHK
MTRLDVTSNKLQKLPPDPLFQRAQVLATSGIISPSTF 
ALSFGGNPLHCNCELLWLRRLSREDDLETCASPP 


2034 


A 


166 


4280 


AS DQSGSQ PGDHS AGQ ANQLKLEDMKS PRRTTLCLMF 

IVIYSSKAALNWNYESTIHPLSLHEHEPAGEEALRQK 

RAVATKSPTAEEYTVNIEISFENASFLDPIKAYLNSL 

SFPIHGNNTDQITDILSINVTTVCRPAGNEIWCSCET 

GYGWPRERCLHNLICQERDVFLPGHHCSCLKELPPNG 

PFCLLQEDVTLNMRVRLNVGFQEDLMNTSSALYRSYK 

TDLETAFRKGYGILPGFKGVTVTGFKSGSVWTYEVK 

TTPPSLELI HKANEQVVQS LNQTYKMDYNS FQAVT IN 

ESNFFVTPEI IFEGDTVSLVCEKEVLSSNVSWRYEEQ 

QLEIQNSSRFSIYTALFNNMTSVSKLTIHNITPGDAG 

EYVCKLILDIFEYECKKKIDVMPIQILANEEMKVMCD 

NNPVSLNCCSQGNVNWSKVEWKQEGKINIPGTPETDI 

DSSCSRYTLKADGTQCPSGSSGTTVI YTCEFI SAYGA 

RGSANIKVTFISVANLTITPDPISVSBGQNFSIKCIS 

DVSNYDEVYWNTSAGI KI YQRFYTTRRYLDGAE SVI/T 

VKTSTREWNGTYHC I FRYKNSYS I ATKDVI VHPLPLK 

LNIMVDPLEATVSCSGSHHIKCCIEEDGDYKVTFHMG 

S S S L PAAKE VNKKQVC YKHNFNAS S VS WCS KTVDVCC 

HFTNAANNSVWSPSMKLNLVPGENITCQDPVIGVGEP 

GKVIQKLCRFSNVPS S PEE / S PLGGTIT YKCVGSQWG 

\EKRNDCI SAPINSLLQMAKALI KSPSQDEMLPTYLK 

DLSISIDKAEHEISSSPGSLGAIINILDLLSTVPTQV 

NSEMMTHVLST\mVILGKPVLNTWKV3^QQWTNQSSQ 
TABLE 7

SEQ ID
Method
Predicted beginning nucleotide location of first amino acid residue of peptide sequence
Predicted ending nucleotide location of last amino acid residue of peptide sequence
Amino acid sequence (X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)










LLHSVERFSQALQSGDSPPLSFSQTNVQMSSTVIKSS
HPETYQQRFVFPYFDLWGNWIDKSYLENLQSDS S IV 
TMAFPTLQAILAQDIQENNFAESLVMTTTVSHNTTMP 
FRISMTFKNNSPSGGETKCVFWNFRLA3STNTGGWDSSG 
CYVEEGDGDNVTCICDHLTSFSILMSPDSPDPSSLLG 
ILLDI I S YVGVGFSILSLAACLWEAWWKSVTKNRT 
SYMRHTCIVNIAASLL\VANTWFIGVAAIQDNRYILC 
KTACVAATFFIHFFYLSVFFWMLTLGLMLFYRLVFIL 
HETSRSTQKAIAFCLGYGCPLAISVITLGATQPREVY 
TRKNVC WLNWEDTKALLAFAI PALI I VWNI TI TI VV 
ITKILRPSIGDKPCKQEKSSLFQISKSIGVLTPLLGL 
TWGFGLTTVFPGTNLVFHI I FAI LNVFQGLFI LLFGC 

MSSPISRRFNNLFGKTGTYNVSTPEATSSSLENSSSA 
SSLLN 


2035 


A 


1 


366 


Ar KbDbKLAJiriy K V H iKjtttiir X 1 LJN JiCvjJvv r o X JVA I JjA 
CHQKLHTGEKLYECEECDKVYIRKSHLERHRRIHTGE 
KPHKCGDCGKAFNSPSHLIRHQRIHTGQKSYKCHQCG 
KVFSLRSLLAE 


2036 


A 


2 


236 


I SGQEGLQAVLASD YS FAQFRYLQRLLLVHGRWS YFR 
MCKFLCYFFYKNFAFTLVHFWFGFFCGFSAQTVYDQW 
FITL t 


2037 


A 


706 


951 


MRCGWGPLGCLGTGAPAGWMVLGS PRSQLQRARWSRA 
SLSAFGWEIRLRPECaPKAIPRQ JjJjJj VALboJi 1 Jjvj VttvMj 
ATPLHCL* 


2038 


A 


1242 


433 


PGSPDVNRAWRPPPPPPPPPPAPQPTMSRRKQGKPQ 
HLS KRE FS PE PLEAI LTDDE PDHGPLGAPEGDHDLLT 
CGQCQMNFPLGDILI FIEHKRKQCNGSLCLEKAVDKP 
PS PS PI EMKKASNPVEVGIQVTPEDDDCLSTS SRGI C 
PKQEHI ADKLLHWRGLSSPRSAHGALI PTPGMSAEYA 
PQGICKDEPSSYTCTTCKQPFTSAWFLLQHAQNTHGL 
RIYLESEHGSPLTPRVLHTPPFGWPRELKMCGSFRM 
EAREPLSSEKI 


2039 


A 


2009 


1889 


MHS AMLGTRVNL S VS D F WRVMMRVCWL VRQDS RHQRI 
RLPHLEAWIGRGPETKITDKJCCSRQQVQIjKAKOWKvj 
YVKVKQVGVNPTS IDS VVIGKDQEVKLQPGQVLHMVN 
ELYPYIVEFEEEAKNPGLETHRKRKRSGNSDSIERDA 
AQEAEAGTGLEPGSNSGQCSVPLKKGKDAPIKKESLG 
HWSQGLKI SMQD PKMQVYKDEQVWI KDKYPKARYHW 
LVLPWTS I S S LKAVARGTP * T P * AYAHCGGKGDCRFC 
W\ «?c?Ta.PPT?T.nYHAIPSMSHVHIjHVISODFDSPCLKN 
KKHWNSFNTEYFLESQAVIEMVQEAGRVTvTUDGMPEL 
LKLPLRCHECQQLLPSI PQLKEHLRKHWTQ* FFFFTV 
LSKFILREKESSGSTQLFHSPTTFPCIRTYAVIVS 


2040 


A 


2009 


1889 


MHSAMLGTRVNLS VS DFWRVMMRVCWLVRQDSRHQRI 
RL PHLEAWI GRG PETRI TDKKCSRQQVQLKAE CNKG 
YVKVKQVGVNPTS I DS WI GKDQEVKLQPGQVLHMVN 
ELYPYIVEFEEEAKNPGLETHRKRKRSGNSDSIERDA 
AQEAEAGTGLEPGSNSGQCSVPLKKGKDAPIKKESLG 
HWSQGLKI SMQDPKMQVYKDEQVWT KDKYPKARYHW 
LVLPWTS I S S LKAVARGT P * T P * AYAHCGGKGDCRFC 
W\ S S KLRFRLGYHAI PSMSHVHLHVT SQDFDS PCLKN 
KKHWNSFNTEYFLESQAVIEMVQEAGRVTVRDGMPEL 
LKLPLRCHECQQLLPSI PQLKEHLRKHWTQ* FFFFTV
LSKFILREKESSGSTQLFHSPTTFPCIRTYAVIVS 


2041 


A 


2009 


1889 


MHS AMLGTRVNLS VS DFWRVMMRVCWLVRQDSRHQR I 

YVKVKQVGVNPTS IDS WIGKDQEVKLQPGQVLHMVN 
ELYPYIVEFEEEAKNPGLETHRKRKRSGNSDSIERDA 
AQEAEAGTGLEPGSNSGQCSVPLKKGKDAPIKKESLG 
HWSQGLKI SMQDPKMQVYKDEQVWI KDKYPKARYHW 
LVLPWTSISSLKAVARGTP*TP*AYAHCGGKGDCRFC 
W\SS KLRFRLGYHAI PSMSHVHLHVISQDFDSPCLKN 
KKHWNSFNTEYFLESQAVIEMVQEAGRVTVRDGMPEL
LKLPLRCHECQQLLPS I PQLKEHLRKHWTQ* FFFFTV 
LSKFILREKESSGSTQLFHSPTTFPCIRTYAVIVS 


2042 


A 


1464 


775 


KMTTAARPTFEPARGGRGKGEGDLSQLSKQYSSRDLP 
SHTKI KYRQTTQDAPEEVRNRDFRRELEERERAAARE 
KNRDRPTREHTTSSS VSKKPRLDQI PAANLDADDPLT 
DEEDEDFEEESDDDDTAALLAELEKIKKERAEEQARK 
EQEQKAEEERIRMENILSGNPLLNLTGPSQPQANFKV 
xro p wnnn\ a/t? kntp A KfivnDOKKDKRFVNDTLRS E FHK 
KFMEKYIK 


2043 


A 


2 


860 


ATTRIRLSGGRSQHEGRVEVQIGGPGPLRWGLICGDD 
WGTLEAMVACRQLGLGYANHGLQETWYWDSGNI TEW 
MSGVRCTGTELSLDQCAHHGTHITCKRTGTRFTAGVI 
CSETASDLLLHSALVQETAYI EDRPLHMLYCAAEENC 
LASSARSANWPYGHRRLLRFSSQIHNLGRADFRPKAG 
P M QWVWWRP HGHYHSMDF FTH YD I LTPNGTKVAEGHK 
AS FCLEDTECQEDVS KRYECANFGEQGI TVGCWDLYR 
HDIDCQWIDITDVKPGNYILHGVINPT 


2044 


A 


973 


266 


ARGS LCAPAS PLYPVNQIjRNVALAQALTPYVFLSDI D 
FLPAYSLYD YLRAS I EQLGLGSRRKAALWPAFETLR 
YRFSFPHSKV^LlxALLDAGTLYTFRYHEWPRGHAPTD 
YARWREAQAPYRVQWAANYEPYVVVPRDCPRYDPRFV 
GFGWNKJfAHIVELDAQEYELLVLPEAFTIHLPHAPSL 
DISRFRSSPTYRDCLQALKDEFHQDLSRHHGAAALKY 
LPALQQPQS PARG 


2045 


A 


1668 


218 


AWRAQGSRGFSGAGWRPRQAAAMNFSEVFKLSSLLC 
KFS PDGKYLAS CVQYRLWRDVNTLQI LQLYTCLDQI 
QHIEWSADSLFILCAMYKRGLVQVWSLEQPEWHCKID 
EGS AGLVAS CWS PDGRHILNTTEFHLRITVWSLCTKS 
VSYIKYPKACLQGITFTRDGRYMALAERRDCKDYVSI 
FVCSDWQLLRHFDTDTQDLTG I EWAPNGCVIxAVWDTC 
LEYKILLYSLDGRLLSTYSAYEWSLGIKSVAWSPSSQ 
FIjAVGSYDGKVRIIxNHVTWKMITEFGHPAAINDPKIV 
VYKEAEKSPQLGLGCLSFPPPRAGAGPLPSSESKYEI 
AS VPVSLQTLKPVTDRANPKI GIGMLAFS PDSYFLAT
RNDNI PNAVWVWDI QKLRLFAVLEQLS PVRAFQWDPQ 
QPRLAI CTGGSRLYLWS PAGCMS VQVPGEGDFAVLSL 
CWHLSGDSMALLSKDHFCLCFIjETEAVVGTACRQLGG 
HT 


2046 


A 


231 




Q D T V Q T7 T . T? T7MMP" TNT P CJVCITT *3 & T ^ T T J i AR S S REROLS 

SEGRFSWRIi*DASSGERS*RRSESSSWLSS*ERESSV 

MPKSDI*LFPQSTFSEPSESACACGDFPSLSVRSGCC 
SSFNSLFSSWSVGNASEASRSGKRSSFL*ACEYLPSE 
INAGGI RSQ PGE I NGS VFDLLERNTLGS SAMPS I LAT 
SWQASV*ASCKRLSSSQASSEESGPDGLPAVSEDWVW 
SANVASALQSSSSMV7SFPAVTERLGESVC\ SPSDDSR 
DCSPGAPLYVGFLYLTLCRDKFYSLKMKKNKLLKIQN 
NTLYRKE KKGHMNMCNTAI F 


2047 


B 


26 


175 


NCGSGDILLKIVKVEHEEMPEAKNVIAVLEEFMKEAL 
DQSF 


2048 


A 


1 


1386 


RDFVAASSRRRRADFPRMTELRQRVAHEPVAPPEDKE 
SESEAKVDGETASDSESRAESAPXjPVSADDTPEVLNR 
ALSNLSSRWKNWWVRGILTLAMIAFFFI IIYLGPMVL 
fix J. VWL vy J.l\.L.r rLCiJ. J. J. .LoiiN v x no iLfurnrn-iiJoni 
FLLSVNYFFYGETVTDYFFTLVQREEPLRILSKYHRL 
I S FTLYIiIGFCMFVLSLVKKHYRI^FYMFGWTHVTIjL 
IWTQSHLVIHNLFEGMIWFIVPISCVICNDIMAYMF 
nwRRrZP TPT .T TCT PKKTWEGF TGGFFAT WFGLLLS Y 

\JC V V VjJtv-L ir JUXIvuOIr £\J\X Hour IVJUf f ax v v l vjuduu j. 

VMSGYRCFVCPVEYNNDTNSFTVDCEPSDLFRLQEYN 
I PGVIQSVIGWKTVRMYPFQIHS I ALSTFASLIGPFG 
GFFASGFKRAFKI KDFANTI PGHGGIMDRFDCQ YLMA 
TFVNVYIASFIRGPNPSKLIQQFLTLRPDQQLHIFNT 
LRSHLIDKGMLTSTTEDE 


2049 


A 


2 


427 


HS WVSRS CAFE PAWEEGATSQTVATCGGEAVCVI DCQ 
TGI VLHKYKAPGEE FFS VAWTALMVVTQAGHKKRWSV 
LAAAGLRGLVRLLHVRAGFCCGVI RAHKKAI ATLCFS 
P RMRTWT .FT A Q YDKR 1 1 LWDIGVPNODYE FO 


2050 


A 


1 


892 


RTRGRTRGRGTRGGGGGGGTGAGGRGEGSQVPGLSAA 
DQDR * GRGCCS PGGRDRAGGGGGI GQGGDAERRRGEQ 
GEGWGRT PGQKPGRGEAPLWKGRV* GPRWRGGPEAA 
GAAAAQRPPGPVPFPAGGAEPLPALQPI PAAQDLRGA 
AQKEGPGGR*GG* PGRRGRGPRERASVPAPSGHAGGA 
EEAAGRRPAWPPGAGPVEAAVPGEAHQGGEGVATLP 
GTQE AGGDAGHGQLSDEGRAPGCSARGGADPGVGG* K 
GEGDERRAAGEHSAEAEPGAF * NQDEDPGGPDPGSAS 
Y 


2051


A 


o 

4S 


lUOO 


FVLCAGACWPLRDRDT / SPPAHLCPEVTPWSLHVPIS 
LQCPPRLCS PPTHRLTPPAGCQRPPPAGPLSVAPASL 
S PS APALLEA/TS PPWTAGATWS PGRS PATQCWPPS W 
CQTPFPHPETGQLCLVRSLH* PHLS SLGQAGAAG* GG 
PLAPPFPPFLVPFP\P*QVQHPRSPA*GAGPEPAVNI 
PQPL/PVPPWD*PLTSPPNSTGAPSWPRAGSVSPSP/ 
VLE PRPEQLSGRQGCS S VS S WGAPGGATDRQAAQGPG 
HPSPGRCCPRRlVLGNEPPAGFGIjRSLWPRSPPHEVG 
ARLPNGAFGFSVRCLLCFPPWRAEPPHIRIGRATPPG 
PGP/ VPSQ P S PRGSM P VPR PGAARGQLDGHVQGSRL 



WO 2004/080148 



PCT/US2003/030720 





2052 


A 


3 



1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPIiFLF 
PGAWAQGHVPPGCSQGLNPLYYNLCDRSGAWGIVLEA 
VAGAGI VTT FVLTI I LVASLPFVQDTKKRSLLGTQVF 

tit T HTT r*T DOT \7T? A P^TT? V"DnT7CT'/^2i ODD I?T.T?n\7T.T? AT 

CFSCLAAHVFALNFIoARKNHGPRGWVI FTVALLLTLV 
EVI INTEWLI ITLVRGSGEGGPQGNSSAGWAVAS PCA 
I A1JMDFVMALI YVMLLLI^AFLGAWPALCGRYKRWRK 
HGVFVLLTTATS VAI WWWI VMYTYGNKQHNS PTWDD 
ryvi atat a AT\iAWAT?\rTJ?vVTPRVGnVTKGG.pFnQYOG 
DMYPTRGVG YETI LKEQKGQSMFVENKAFSMDE PVAA 
KRPVSPYSGYNGQLLTSVYQPTEMALMHKVPSEGAYD 
1 1 LPRATANSQVMGSANSTLRAEDMYS AQSHQAATPP 
KDGKNSQVFRNPYVWD 


2053 


A 


2 


555 


LMKNPDKAVPIPEKMSEWAPRPPPEFVRDVMGSSAGA 
GS GE FHVYRHLRRRE YQRQD YMDAMAEKQKJjDAEFQK 
RLEKNKIAAEEQTAKRRKKRQKLKEKKLLAKKMKLEQ 
KKQEGPGQPKEQGSSSSAEASGTEEEEEVPSFTMGR 


2054 


A 


1008 


534 


HEKMAAAWGS SLTAATQRAVTPWPRGRLLTASLGPQA 
RRRAS S S S PE AGEGQ I RLTDS CV QRLLE I TEGS EFLR 
LQVEGGGCSGFQYKFSLDTVINPDDRVFEQGGARWV 
DSDS LAFVKGAQVDFSQEL IRS S FQVLNNPQAQQGCS 


2055 


A 


1492 


528 


THWMTGMCYAPHQVLSYINGVTTSKPGVSLVYSMPS 
RNLSLRLEGLQEKDSGPYSCSVNVQDKQGKSRGHSIK 
TLELNVLVPPAPPSCRLQGVPHVGANVTLSCQSPRSK 
PAVQYQWDRQLPS FQTFFAPALDVI RGSLSLTNLSS S 
m a n\rv\Tn va tnsTT7 T 7f^T aopxhttt .JTVGTnPfiAAWAGAV 

[*Lft\jV x V V^J\xvrtlNCi VljXx\yL.lM V X UXj V O x\j xr\xc\rx v vnvjn v 

VGTLVGIiGLLAGIiVLLYHRRGKALEEPANDI KEDAI A 
PRTLPWPKS SDTI SKNGTLSS VTS ARALRPPHGPPRP 
Cl AT ,TPT PSIjS SOAIjPS PRLPTTDGAHPOPI S P I PGGV 
SSSGLSRMGAVPVMVPAQSQAGSLV 


2056 


A 


820 


319 


VVEFPVLTKAATSGILSALGNFlxAQMIEKKRKKENSR 
SLDVGGPLRYAVYGFFFTGPLSHFFYFFMEHWI PPEV 
PxxAGIxRRLLLJDPJjWAPAFIaMLFFLIMNFLEGKDASA 
VAATCMPf^FWPATjPJANWRVWTPLOFINIlTx^PLKFR 
LFANIxAALFWYAYIxASLGK 


2057 


A 


520 


330 


HGCVLSLLPKPQQGFREPVHLTSTC/ PNPTPPVPP* S 
DRYLSNPTQPVPP * SDR YLSNPTPPVSP * SDRYLSNP 
TPPVPP*SDRYLSNRTPPVSP*SDRYLSNPTPPVSP 




2058


A


2


479


DTGQKGLPGPPGPPGYGSQGIKGEQGPQGFPGPKGTM 
GHGLPGQKGEHGERGDVGKKGDKGE IGE PGS PGKQGL 
QGPKGDLGLTKEEIIKLITEICGCGPKCKETPLELVF 
VIDSSESVGPENFQI IKNFVKTMADRVALDLATARIG 
IINYSHKVEKV 


2059 


A 


503 


1051 


VFLYPFLKWWRDP*RRELPTFHWFLIiELAIFTIjIEEV 
LF YYSHRLLHHPTFYKKIHKKHHEWTAPIGVI SLYAH 
PIEHAVSNMLPVIVGPIjVMGSHLSSITMWFSIiALIIT 
TISHCGYHLPFLPSPEFHDYHHLKFNQCYGVLGVLDH 
LHGTDTMFKQTKAYERHVLLLGFTPIiSESI PDSPK 


2060 


A 


1 


716 


ERVGNVCSLEI SNIQKGEGGE YMCHAVNI IGEAKS FA 
NVDIMPQ/ RRKSGGTTTSR/ 1 FVDPNMDSREGEDKEL 
KIDLEVFEMPPRFIMPI CDFKI PENSDAVFKCSVIGI 
PTPEVKWYKEYMCIEPDNIKYVISEEKGSHTLKIRNV 
CLSDSATYRCRAVNCVGEAI CRGFLTMGDSEI FAVIA 
KKSKVTLSSLMEELVLKSNYTDSFFEFQVGEGPPRFI 
KGI SDC YAPI GTAAYFQCL 


2061 


A 


47 


538 


RVRLRPVFCVMTSQEKTEEYPFADI FDEDETERNFLL 
S KPVCF WFGKPGVGKTTLARYI TQAWKC IRVEALPI 
LEEQI AAETESGVMLQSMLI SGQS I PDELVIKLMLEK 
LNSPEVCHFGYIITEIPSLSQDAMTTLQQIELIKNL\ 
NLKPDVI INIKGVLDF 


2062 


A 


1196 


230 


RARSGLQGAVPLGPTGRSRHSLQTKLPSSPFSERPLV 
FQTPGALVSTPHGRYPPPLCPPKAAFQKVIHGKAVPS 
NPS / WPTAI VNPVRSTAGPGTLGQGSLRKGRSSMRK 
NGSLQRPLQSGIPTLWGSLRRSPT/MGPSASAVPIL 
PATGDPLLPLSRGGGDGVQA/ SPSRGSPPSRASAGAV
RPGSTPRPAPSLWKTKKSPSRVSIjCQNRPHLPHHPSW 

*NQKTQEMASKSKSKP*DFRITALLPPNITPPIPPP/
AKPEQPATLKASQPEAASLGPEMTVLFAHRSGCHSGQ
QTDLRRKSALGKATTLVSTASGTQTVFPSK


2063 


A 


1196 


230 


RARSGLQGAVPLGPTGRSRHSLQTKLPSSPFSERPLV
FQTPGALVSTPHGRYPPPLCPPKAAFQKVIHGKAVPS
NPS/WPTAIVNPVRSTAGPGTLGQGSLRKGRSSMRK
NGSLQRPLQSGIPTLVVGSLRRSPT/MGPSASAVPIL

PATGDPLLPLSRGGGDGVQA/ SPSRGSPPSRASAGAV 
RPGSTPRPAPSLWKTKKS PSRVSLCQNRPHLPHHPSW 
*NQKTQEMASKSKSKP*DFRITALLPPNITPPIPPP/ 
AKPEQPATLKASQPEAASLGPEMTVLFAHRSGCHSGQ 
QTDLRRKSALGKATTLVSTASGTQTVFPSK 


2064 


A 


1554 


1358 


E FVMRHKGAKHLRS AAHDLTWFQHYS I DVIGFLLTCV 
ATAI FLFTKCFLFSCQKFNKTRKIEKRE 


2065 


A 


793 


279 


HEGASLGVRGGGMADTVLFEFLHTEMVAELWAHDPDP 
GPGGQKMSLSVLEGMGFRVGQALGERLPRETLAFREE 
LDVLKFLCKDLWVAVFQKQMDSLRTNHQGTYVLQDNS 
F PLLLPMAS GLQYLEEAPKFLAFTCGLLRGALYTLGI 
ESWTASVAALPVCKFQWI PKS 


2066 


A 


729 


487 


I IFIYLFIFLRWSL/GSVAQAEVQWPHLNSLQAPPPG 
FAPFSCLRLPSSWDYRHLPPCPANFLYFWWRRGFTML 
ARMVLI S * PRDPPAS ASQGAGI AGMSHCARP *MNYFY 
LFI YFFEME SRSVAQAEVQWPHLNSLQAPPPGFAPF S 
CLRLPSSWDYRHLPPCPANFLYFWWRRGFTMLARMVL 

IS 


2067 


A 


1 


692 


PGGNRSSSSSCRRCICTFCTCRSRRRRRSHQPRRSSW 
GPLQAEVRLEFPSEKRRGSGTRGGRGGSTGVASVGSS 
TWGGT PGLGQTGTWQG/ HTGQRGPQL PPHP\ RNS FS S 
RHRGS SG\ RLSQA\ LPE PRGLE SGKTGS ARGVAAGRH 
QEGEAATGGGPRDIAQQGGCRGSACGRRSHEALRPRV 
WCGEGPQWTW\CAVCPHRSAPGAGLAD\RQHPGESRA 
WGETRLGEAGGAE 


2068 


A 


114 


1031 


MPLLTLYLLLFWLSGYSIATQITGPTTVNGLERGSLT 
VQCVYRSGWETYLKWWCRGAIWRDCKILVKTSGSEQE 
VKRDRVS I KDNQKNRTFTVTMEDLMKTDADTYWCGIE 
KTGNDLGVTVQVT IDPASTPAPTTPTSTTFTAPVTQE 
ETSSSPTLTGHHLDNRHKLLKLSVLLPLIFTILLLLL 
VAASLLAWRMMKYQQKAAGMS PEQVLQPLEGDLCYAD 
LTLQLAGTS PQKATTKLS SAQVDQVEVEYVTMASLPK 
EDISYASLTLGAEDQEPTYCNMGHLSSHLPGRGPEEP 


2069 


A 


114 


1031 


MPLLTLYLLLFWLSGYSIATQITGPTTVNGLERGSLT 
VQCVYRSGWETYLKWWCRGAIWRDCKILVKTSGSEQE 
VKRDRVS I KDNQKNRTFTVTMEDLMKTD ADTYWCGI E 
KTGNDLGVTVQVTI DPAST PAPTTPTSTTFTAPVTQE 
ETSSSPTLTGHHLDNRHKLLKLSVLLPLIFTILLLLL 
VAASLLAWRMMKYQQKAAGMS PEQVLQPLEGDLCYAD 
LTLQLAGTS PQKATTKLS SAQVDQVEVEYVTMASLPK 
EDISYASLTLGAEDQEPTYCNMGHLSSHLPGRGPEEP 

TEYSTISRP*


2070 


A 



114 


1031 


MPLLTLYLLLFWLSGYSIATQITGPTTVNGLERGSLT 
VQCVYRSGWETYLKWWCRGAIWRDCKILVKTSGSEQE 
VKRDRVS I KDNQKNRTFTVTMEDLMKTDADTYWCGI E 
KTGNDLGVTVQVT I DPAS TPAPTTPTSTTFTAPVTQE 
ETSS SPTLTGHHLDNRHKLLKLSVLLPLI FTI LLLLL 
VAASLLAWRMMKYQQKAAGMS PEQVLQPLEGDLCYAD 
LTLQLAGTS PQKATTKLS SAQVDQVEVEYVTMASLPK 
EDI S YASLTLGAEDQEPTYCNMGHLSSHLPGRGPEEP 

TEYSTISRP* 


2071 


A 


51 


1464 


ALPGEFFFRFHPAHKHCHLLPPSLFTNVTTQSEISSF 
LS FLHFQQVPLRQKPRRKTQGFLTMSRRRI S CKDLGH 
ADCQGWLYKKKEKGSFLSNKWKKFWVILKGSSLYWYS 
wnMiuifann pumt .PTTPTVERASECKKKHAFKI SHPQI 
KTFYFAAENVQEMNVWLNKLGSAVIHQESTTKDEECY 
SESEQEDPEIAAETPPPPHASQTQSLTAQQASSSSPS 
LSGTSYSFSSLENTVKTPSSFPSSLSKERQSLPDTVN 
cit . <-5 A APDEGOPITFAVOVHS P VPS EAGI HKALENS FV 
TSESGFLNSLSSDDTSSLSSNHDHLTVPDKPAGSKIM 
DKEETKVS EDDEMEKLYKSLEQASLS PLGDRRPSTKK 
ELRKSFVKRCKNPS INEKLHKIRTLNSTLKCKEHDLA 
M TNOT J .nnpKLTARKYRE WKVMNTLLI QDI YQQQRAS 
PAPDDTDDTPQELKKSPSSPSVENSI 


2072


A


87 


477 


IKS KLNQQVEVQE S E WRLTEAKGP TMGKE S GWD S GRA 
AVAAWGGWAVGTVLVALS AMGFTSVGI AASS I AAK 
MMSTAAI ANGGGVAAGSLVAI LQS VGAAGLSVTS KVI 
GGFAGTALGAWLGS PPS S 


2073 


A 


87 


477 


I KS KLNQQVEVQES EWRLTEAKGPTMGKE SGWD SGRA 
AVAAWGGWAVGTVLVALSAMGFTSVGI AASS I AAK 
MMSTAAI ANGGGVAAGSLVAI LQS VGAAGLSVTS KVI 
GGFAGTALGAWLGS PPSS 


2074 


A 


112 


483 


AGVGALRMVQRLTYRRRLSYNTASNKTRLSRTPGNRI 
VYLYTKKVGKAPKS ACGVC PGRLRGVRAVRPKVLMRL 
SKTKKHVSRAYGGSMCAKCVRDRI KRAFLI EEQKI W 
KVLKAQAQSQKAK 


2075 


A 


2 


446 


FQNMTCELHLTCSVEDADDNVSFRWEALGNTLSSQPN 
LTVSWDPRISSEQDYTCIAENAVSNLSFSVSAQKLCE 
DVKIQYTDTKMILFMVSGICI VFGFI ILLLLVLRKRR 


2076 


A 


1208 


249 


VGWSVHRYVLLHHVMGGLEGMQGAWGYVQGGMGALSD 
AI AS S ATTHGAS I FTEKTVAKVQVNSEGCVQGVVLED 
GTEVRSKMVLSNTSPQITFLKLTPQEWLPEEFLERIS . 
nT.nTPc-pTTTK'TM /V* RAHHT AALSPLTHLSEKPPGWG 
Q/ HELSHHLH/ CPDLQPVS PCSLVRSGRRQAAQ/ PSW 
RPPMLPGASRCPITNAPST*TVKTPSSFIRPLKMPWM 
ACLPTVFDCIEVYAPGFKDS WGRDILTPPDLERI FG 
LPGGNI FHCAMSLDQLYFARPVPLHSGYRCPLQGLYL 
CGSGAHPGGGVMGAAGRNAAHVAFRDLKSM 


2077 


A 


38 


376 


MALGVPI S VYLLFNAMTALTEEAAVTVTP PI TAQQGN 
WTVNKTEADNIEGPIALKFSHLCLEDHNSYCINGACA 
FHHELE KAI CRCFTGYTGERCLKLKS P YNVCSGERRP 
L* 


2078 


A 


38 


376 


MALGVPI S VYLLFNAMTALTEEAAVTVTPPI TAQQGN 
WTVNKTEADNIEGPIALKFSHLCLEDHNSYCINGACA 
FHHELE KAI CRCFTGYTGERCLKLKS PYNVCSGERRP 
L* 


2079 


A 


38 


376 


MALGVPISVYLLFNAMTALTEEAAVTVTPPITAQQGN
WTVNKTEADNIEGPIALKFSHLCLEDHNSYC INGACA 
FHHELEKAI CRCFTGYTGERCLKLKS PYNVCSGERRP 
L* 


2080 


A 


1 


675 


MAPPLRPLARLRPPGMLLRALLLLLLLSPLPGLREGI 
GELITPIGTSLPDLDPARRRWEGGIGRVGSEVADLCP 
GKEGGKVPEAEKEGVWCFSELSFVKEPQDVTVTRKDP 
VVLDCQAHGEVPIKVTWLKNGAKMSENKRIEVLSNGS 
LYISEVEGRRGEQSDEGFYQCLAMNK\F*AILNQKAH 
IiALSRIGST* RRRPDRP * EDEAFVMTTHCFQDLLTSL 
IES 


2081 


B 


1 


3147 


MAKI SASRAEKVLEHPGEREKGREMAS PWNHS I LALA 
AVWI I SMVLLGRS IQASRKEKMQPPEKETPEVLHLD 
EAKDHNSLNNLRETLLSEKPNLAQVELELKERDVLSV 
FLPDVPETES YI SWNMALPPFFGQGRPGPPPPQPPP 
LALFGCPPPPLPSPAFPPPLPQRPGPFPGASAPFLQP 
PLALQPRASAQASRGGGGAGAFYPVPPPPLPPPPPQC 
RPFPGTDAGERPRPPPPGPGPPWSPRWPEAPPPPADV 
LGDAALQRLRDRQWLEAVFGT PRRAGCPVPQRTHAGP 
SLGEVRARLLRALRLVRRLRGLSQALREAEADGAAWV 
LLYSQTAPLRAELAERLQPLTQAAYVGEARRRLERVR 
RRRLRLRERAREREAEREAEAARAVEREQEIDRWRVK 
CVQE VEEKKRFFCE I LTDELVLWE PSGRPQ PQQLQIL 
TAMS TSTF YDKE LKTARENKE EE L I DKLE WTM PS PS 
PKGLPVKQYAVQSQLPVYEWPDVGSGEYDVGWASFG 
RLLNEALILKFPYSALGGSGS PAPLTRLASPAAPQDG 
QVDLEGRALR PAARAGFSKHRGHGDALDGHAGLRPEL 
HAPLTVVADGLFSKFRKSLVSNKVSVSSHFVGFLMKH 
DFSIiERTALFWVEAAGQGPS PYQCGDPGTASAPPAWL 
LLVSPEHGLAPAPTTIRDPEAGHQERPEEEGEDEAEA 
q qp Q T?PP P A P c; Q T .OPC!^ P A PfiPflpPT iPS L.DVLRGVR 
LELAGARRRLSEGKLVSRPRALLHGLRGHRALSLCPS 
PAQSPRSASPPGPAPQHPAAPAS PPRPSTAGAI PPLR 
SHKPTVAIYITTKRLPYFPIVNFLFLIAQLPKLQYNK 
NVALTVKFLTKRF I S E YDPNLGMVCRKPTD PVDWP PL 
VLGLLTLMKQFHSRYTEQFLALIGQFICSTVEQCTRQ 
VTKAEGVALAGRFGCLFFEVSACLDFEHVQHVFHEAV 
PT?apPT7T.T?K"<?PT»TPPTiPT RA"L PHO AP LTARHGLA 
SCTFNTLSTINLKEMPTVAQAKLVTVKSSRAQSKRKA 
PTLTLLKGFKIF 


2082 


A 


85 


839 


RSGSLMAAAAATKILLCLPLLLLLSGWSRAGRADPHS 
LCYDI TVI PKFRPGPRWCAVQGQVDEKTFLHYDCGNK 
TVTPVS PLGKKLNWTAWKAQNPVLREVVDI LTEQLR 
DIQLENYTPKEPLTLQARMSCEQKAEGHSSGSWQFSF 
DGQIFLLFDSEKRMWTTVHPGARKMKEKWENDKWAM 
SFHYFSMGDCIGWLEDFLMGMDSTLEPSAGAPLAMSS 
GTTQLRATATTLILCCLLILLPCFILPGI 


2083 


A 


1 


1742 


VSAVEFVLHGKDFQVDCKASGSPVP* ISWSLLDGTMI 
NNAMQ ADD S GHRTRR YTL FNNGTL YFNKVGVAE EGD Y 
TC YAQNTLGKDEMKVHLTVITAAPRIRQSNKTNKRI K 
AGDTAVLDCEVTGDPKPKI FWLLPSNDMI SFSI DRYT 
FHANGSLTINKVKLLDSGEYVCVARNPSGDDTKMYKL 
D WS KP PL I NGL YTNRTVI KATAVRHSKKHFDCRAEG 
TP PmTMWTMPDN I FTjTAPYYGSRI TVHKNGTLE I RN 
VRLSDSADF I CVARNEGGE SVLWQLEVLEMLRRPTF 
RNPFNEKIVAQLGKSTALNCSVDGNPPPEIIWILPNG 
TRFSNGPQS YQYLI ASNGSFI I SKTTREDAGKYRCAA 
RNKVGYIEKLVILEI GQKPVI LTYAPGTVKGI SGESL 
SLHCVSDGI PKPNI KWTMPSGYWDRPQINGKYILHD 
NGTLVI KEATAYDRGNYI CKAQNS VGHTLITVPVMI V 
AYP PRI TNRPPRS I VTRTGAAFQLHCVALGVPKPE I T 
WEMPDHSLLSTASKERTHGSEQLHLQGTLVIQNPQTS 
DSGIYKCTAKNPLGSDYAATYIQVI 


2084 


A 


1 


1742 


VSAVEFVLHGKDFQVDCKASGS PVP * I SWSLLDGTMI 
NNAMQADDSGHRTRRYTLFNNGTLYFNKVGVAEEGDY 
TCYAQNTLGKDEMKVHLTVITAAPRIRQSNKTNKRI K 
AGDTAVLDCEVTGDPKPKI FWLLPSNDMI S FS IDRYT 
FHANGSLTINKVKLLDSGEYVCVARNPSGDDTKMYKL 
DWSKP PL I NGL YTNRTVI KATAVRHSKKHFDCRAEG 
TPS PEVMWI MPDNI FLTAPYYGSRI TVHKNGTLE IRN 
VRLSDSADF I CVARNEGGE SVLWQLEVLEMLRRPTF 
RNPFNEKIVAQLGKSTALNCSVDGNPPPEIIWILPNG 
TRFSNGPQS YQYLI ASNGSFI I SKTTREDAGKYRCAA 
RNKVGYI EKLVI LEIGQKPVI LTYAPGTVKGI SGESL 
SLHCVSDGI PKPNI KWTMPSGYWDRPQINGKYI LHD 
NGTLVI KEATAYDRGNYI CKAQNS VGHTLITVPVMI V 
AYPPRITNRPPRS I VTRTGAAFQLHCVALGVPKPE IT 
WEMPDHSLLSTAS KERTHGS EQLHLQGTLVI QNPQTS 
DS GI YKCTAKNPLGSD YAATYI QVI 


2085 


A 


1 


1742 


VSAVEFVLHGKDFQVDCKASGS PVP * I SWSLLDGTMI 
NNAMQADDSGHRTRRYTLFNNGTLYFNKVGVAEEGDY 
TCYAQNTLGKDEMKVHLTVITAAPRIRQSNKTNKRIK 
AGDTAVLDCEVTGDPKPKI FWLLPSNDMI S FS I DRYT
FHANGSLTINKVKLLDSGEYVCVARNPSGDDTKMYKL 
DWSKPPLINGLYTNRTVIKATAVRHSKKHFDCRAEG 
TP q PF VMWTMPDNTFIiTAPYYGSRI TVHKNGTLE IRN 
VRLSDS ADFI CVARNEGGES VLWQLEVLEMLRRPTF 
RKTPFNEKIVAQLGKSTALNCSVDGNPPPEIIWILPNG 
TRFSNGPQS YQYLI ASNGS FI I SKTTREDAGKYRCAA 
RNKVGYIEKLVI LEIGQKPVILTYAPGTVKGI SGESL
SLHCVSDGIPKPNIKWTMPSGYWDRPQINGKYILHD 
NGTLVIKEATAYDRGNYICKAQNSVGHTLITVPVMIV 
AYPPRI TNRPPRS I VTRTGAAFQLHCVALGVPKPE IT
WEMPDHSLL STAS KERTHGS EQLHLQGTLVI QNPQTS 
DSGIYKCTAKNPLGSDYAATYIQVI 


2086 


A 


180 


275 


MEEPQSDPSVEPPLSQETFSDLWKLLSENNVL 


2087






1147


MASMAAVLTWALALLSAFSATQARKGFWDYFSQTSGD
KGRVEQI HQQKMARE PATLKDSLEQDLNNMNKFLEKL 
RPLSGSEAPRLPQDPVGMRRQLQEELEEVKARLQPYM 
AEAHELVGWNLEGLRQQLKPYTMDLMEQVALRVQELQ 
EQLRWGEDTKAQLLGGVDEAWALLQGLQSRWHHTG 
RFKELFHP YAE S LVSGI GRHVQELHRS VAPHAPAS PA 
RLSRCVQVLSRKLTLKAKALHARIQQNLDQLREELSR 
AFAGTGTEEGAGPDPQMLSEEVRQRLQAFRQDTYLQI 
AAFTRAIDQETEEVQQQLAPPPPGHSAFAPEFQQTDS 
GKVLS KLQARLDDLWED I THS LHDQGHSHLGDP * 






2088


47


1147


MASMAAVLTWALALLSAFSATQARKGFWDYFSQTSGD 

KGRVEQIHQQKMAREPATLKDSLEQDLNNMNKFLEKL

RPLSGSEAPRLPQDPVGMRRQLQEELEEVKARLQPYM 

AEAHELVGWNLEGLRQQLKPYTMDLMEQVALRVQELQ 

EQLRWGEDTKAQLLGGVDEAWALLQGLQSRWHHTG 

RFKELFHPYAESLVSGI GRHVQELHRS VAPHAPAS PA 

RLSRCVQVLSRKLTLKAKALHARIQQNLDQLREELSR 

AFAGTGTEEGAGPDPQMLSEEVRQRLQAFRQDTYLQI 

AAFTRAIDQETEEVQQQLAPPPPGHSAFAPEFQQTDS

GKVLS KLQARLDDLWEDI THS LHDQGHSHLGDP * 


2089 


A 


1199 


329 


DFGEFMRENRLTPFLDPRYKIDGSLEVPLERAKDQLE 
KHTRYWPMI I SQTTI FNMQAWPLASVI VKESLTEED 
VLNCQKTIYNLVD^RKNDPLPISTVGTRGKGPKRDE 
QYRIMWNELETLVRAHINNSEKHQRVLECLMACRSKP 
PEEEERKKRGRKREDKEDKSEKAVKDYEQEKSWQDSE 
RLKGI LERGKEELAEAEI I KDS PDS PEPPNKKPLVEM 
DETPQVEKSKGPVSLLSLWSNRINTANSRKHQEFAGR 
LNSVNNRAELYQHLKEENGMETTENGKASRQ 


2090 


A 


3 


456 


RWNS IMELALLCGLWMAGVI PI QGGI LNLNKMVKQV 
TGKMPILSYWPYGCHCGLGGRGQPKDATDWCCQTHDC 
CYDHLKTQGCGIYKDYYRYNFSQGNIHCSDKGSWCEQ 
QLCACDKEVAFCIiKRNLDTYQKRLRFYWRPHCRGQTP 
GC 


2091 


A 


27 


489 


EGEE PRD * PKMPLTPE PP / VWARGGAPRMGS S PMALT 
ALWALHPHHAGPGHPGCALHPHHRCVG* TPVPPCS PP 
RPQPPCTHPGVAPRRRAVD*AHGHRPRAL*GLWLCG 
PPADRSGP*ASHPATWAPRPYWRSQPGAPSGGPSPGR 
GGPPPQA 


2092 


A 


2022 


617 


VI PPVLTARGPRPRGAGAMVRGRI SRLSVRDVRFPTS 
LGGHGADAMHTDPDYS AAYWI ETDAEDGI KGCGITF 
TLGKGTEVVVCAVNALAHHVLNKDLKDIVGDFRGFYR 

r\T TCTVOT D TaT T f "DTP VT*\ T\ TUT A 7AT7T ."M7A\7WF)T .W7AT^O*R 
QUI bUC^ljKWXoFiil\\jrV VriliAl AAV J_uNA V WUJjWAXV.yn» 

GKPVWKLLVDMDPRMLVSCIDFRYITDVLTEEDALEI i 

KQLCAQALKDGWTRFKVKVGADLQDDMRRCQI IRDMI 
GPEKTI1MMDANQRWDVPEAVEWMSKI1AKFKPLWIEEP 
TS P * LTFLGHATI \ SKALVPFRELGI CTRENS CHNRV 
IFKQLLQAKALQFLQIDSCRLGSYNENLSVLLMAKKF 
EI PVC PHAGGVGLCELVQHL I I FDYISVSASLENRVC 
EYVDHLHEHFKYPVMIQRASYMPPKDPGYSTE\LKEE 
SCKRNTQYPQMGEVWEETPFPAQEN 


2093 


A 


63 


193 


SGRLAPHTSRRTSANCSDDAKSSDSCSPSRKT*WSGR 
NTNRIH 


2094 


A 


1404 


142 


I PGSTISWS PAAARGLS VCRC CRLHPASAMDLFGDLP 
EPERSPRPAAGKEAQKGPLLFDDLPPASSTDSGSGGP 
t t DnnT DDacQr^nQf^QT.ATQT^nMVKTKnKGAKRKTS 

1 M i H I Jj J I i Y 1 f^ 1 M :*t I -r 1 J ri I -to I t/A l. ^> J. O^l l V IV A uvjivuruuviv o. *j 

EEEKNGSEELVEKKVCKASSVI FGLKGYVAERKGERE 
EMQDAHVILNDITEECRPPSSLITRVSYFAVFDGHGG 
I RASKFAAQNLHQNLI RKF PKGDVT SVEKTVKRCLLD 
TFKHTDEEFLKQASSQKPAWKDGSTATCVLAVDNILY 
TaMT nnQpaTT.rPYMPR^nKHAAIjSLSKEHNPTOYEE 
RMRI QKAGGNVRDGRVLGVLE VSRS I GDGQ YKRCGVT 
SVPDIRRCQLTPNDRFILLACDGLFKVFTPEEAVNFI 
L S CLEDEKI QTREGKS AADAR YEAACNRLANKAVQRG 
SADNVTVMVVRIGH 


2095 


A 


2 


541 


FVGHCVNTEGGFVCERGPGMRVSADRHS CQDTDECLG 
TPCQQRCKNSIGSYKCSCRTGFHLHGNRHSCV/DYTP 
RI PLC S PI FLAAFAPLDVNECRRPLERRVCHHS CHNT 
GGS FLCTCRPGFRLRADRVS CE / DF PE SRAGPI CHPA 
TPVTPVQE/CYCCLLRPHGLPCAQDIDLLIjGLQGHQ 


2096 


A 


1206 


2266 


RHLLTIFHKLKIYKTINKIDFKKKRVTQLLVFCLFLC

LFFSSEMVKNQTMVTEFLLLGFLLGPRIQMLLFGLFS

LFYVFTLLGNGTILGLISLDSRLHTPMYFFLSHLAVV

NI AYACNTVPQMLVNLLHPAKPI SFAGCMT* TFLFLS 

FAHTECLLLVLMS YDRYVAI CHPLRYFI IMTWKVCIT 

LAITSWTCGSLLAMVHVSLIIiRLPFCGPREINHFFCE 

ILSVIxRIACADTWLNQWIFAACMFIIiVGPLCLVLVS \ 

YSHILAAILRIQSGEGRRKAFSTCSSHLCWGLFFGS 

AIVMYMAPKSRHPEEQQKVLFLFYSSFNPMLNPLIYN 

LRNVEVKGALRRALCKESHS 


2097 


A 


1206 


2266 


RHLLTIFHKLKIYKTINKIDFKKKRVTQLLVFCLFLC 



LFFSSEMVKNQTMVTEFLLLGFLLGPRIQMLLFGLFS 
LFYVFTLLGNGTILGLI SLDSRLHTPMYFFLSHLAW 
NI AYACNTVPQMLVNLLHPAKPI SFAGCMT*TFLFLS 
FAHTECLLLVLMS YDRYVAI CHPLRYFI IMTWKVCIT 
LAI TSWTCGSLLAMVHVSLI LRLPFCGPRE INHFFCE 
ILSVLRLACADTWLNQWI FAACMF I LVGPLCLVLVS 
YSHILAAILRIQSGEGRRKAFSTCSSHLCWGLFFGS 
AIVMYMAPKSRHPEEQQKVLFLFYSSFNPMLNPLIYN 
LRNVE VKGALRRALCKE S HS 


2098 


A 


276 


243


EKWPD*SRAACPVLCRGNGQYSKGRCLCFSGWKGTEC 
DVPTTQC I D PQCGGRGI C IMGS CACNSGYKGES CEE A 
PRYIPEKE 


2099 


A 


4 


770 


RETGSVSLS PSGLEGAES YAVSPI LYSSPDVKELWLE 
TLQGQRHSHTGVKSTPGQSAAILMKLRSSHNASKTLN 
ANNMETLIECQSEGDIKEHPLLASCESEDSICQLIEV 
KKRKKVLSWPFLMRRLSPASDFSGALETDLKASLFDQ 
PLSIICGDSDTLPRPIQDILTILCLKGPSTEGIFRRA 
ANEKARKELKEELNSGDAVDLERLPVHLLAWFKDFL 
RSI PRKLLS SDLFEEWMGALEMQDEEDRI EALK 


2100 


A 


901 


521 


FFFGNGVSPCRQAGV*WHDLDSLQNLPPGFKRFSYLS 
LPS S W\ DYRHVLPRQANF C I F /M * RRGFTMLARMVS I 
S * PRDLPALASQSAGITGVSHHAPPQMDFTFALLCFA 
LKGCLPRQKEGGTLNLI 


2101 


A 


901 


521 


FFFGNGVS PCRQAGV* WHDLDSLQNLPPGFKRF S YLS 
LPS SW\DYRHVLPRQANFCI F /M * RRGFTMLARMVS I 
S* PRDLPALASQSAGITGVSHHAPPQMDFTFALLCFA 
LKGCLPRQKEGGTLNLI 


2102 


A 


3 


600 


PRCRNSARVADTFYTNAGCTLVALNPFKPVPQLYS PE 
LMRE YHAAPQPQKLKPHVFTVGEQTYRNVKSLI EPVN 
QS I WSGE SGAGKTWTSRCLMKFYAWATS PAS WESH 
KI AERI EQRI LNSNPVMEAFGNACTLRNNNSSRFGKF 
IQLQLNRAQQMTGAAVQTYLLEKTRVACQASSERNKD 
PI PPELTRLLQQSQ 


2103 


A 


3 


600 


PRCRNS ARVADT F YTNAGCTLVALNPFKP VPQL YS PE 
LMREYHAAPQPQKLKPHVFTVGEQTYRNVKSLI EPVN 
QSIWSGE SGAGKTWTSRCLMKFYAWATS PAS WESH 
KIAERIEQRILNSNPVMEAFGNACTLRNNNSSRFGKF 
IQLQLNRAQQMTGAAVQTYLLEKTRVACQASSERNKD 
PI PPELTRLLQQSQ 


2104 


A 


10 


435 


FKWLLKSHAI CFWTRS * S YCDNVCVPSLWAHHLGIRT 
EI PEFFLSKFLCTS 1 1 PHFTYRRQLRLIQGSTE *EA* 
EDKLEQK*ALGAAQFTLPGMDVFVCFVFCF/ CLFEME 
SHSVT*ARVQWCDLGSLQPLPLGFKQFSCLGL 


2105 


A 


79 


1222 


CQRREDAAE FWLCFALDPSKDPCLKVKCSPHKVCVTQ 
DYQTALCVSRKHLLPRQKKGNVAQKHWVGPSNLVKCK 
PCPVAQSAMVCGSDGHS YTS KCKLEFHACSTGKSLAT 
LCDG\PCPCLPEP\EPPKHKGRKGVPCTDKELRNLAS 
RLKDWFGALHEDANRVI KPTSSNTAQGRFDTSI LPI C 
KDSLGWMLNKLDMNYDLLLDPSEINAIYLDKYEPCIK 
PLFNSCDSFKDGKPFLNNEWCLLPSQNPGGLP/CAQN 
EMNRIQ\KLSKGKSLLGAFIPRCMEEGYYKATQCHGS

TGQCWCVDKYGNELAGSRKQGAVSCEEEQETSGDFGS 
GGSWLLDDLEYERELGPKDKEGKLRVHTRAVTEDDE 
DEDDDKEDEVGYIW 


2106 


A 


174 


857 


MLNIiAFTVGSFLLSAITLPLGIVMDKYGPRKIiRLLGS 
ACFAVSCLLIAYGASKPNALSVLIFIALALNGFGOyiC 
MTFTSLTLPNMFGDLRFTFIALMIGSYASSAVTFPGI 
KLIYDAGVSFIWLWWAGCSGLVFLNCFFNWPLEPF 
PGPEDMDYSVKI KFSWLGFDHKITGKQFYKQVTTVGR 
RLSVGSSMRSAKEQVALQEGHKLCLSTVDRNSXRSXA 
LVSGYP 


2107 


A 


174 


857 


MLNIiAFTVGSFLLSAITLPIiGIVMDKYGPRKLRLLGS 
ACFAVSCLLI AYGASKPNALS VLI FI ALALNGFGGMC 
MTFTSLTLPNMFGDLRFTFIALMIGSYASSAVTFPGI 
KL I YDAGVS FI WL WWAGCSGL VFLNCF FN WPLE P F 
PGPEDMDYSVKI KFSWLGFDHKITGKQFYKQVTTVGR 
RLSVGSSMRSAKEQVALQEGHKLCLSTVDRNSXRSXA 
LVSGYP 


2108 


A 


1 


570 


YAAFGAWTRVSLPAPRCPALGGLASGPGESGPALLQ 
VCGAKC PGGAPRGENREKEETTRI GPGVME S KE KRAV 
NSLSMENANQENEEKEQVANKGEPLALPLDAGEYCVP 
RGNRRRFRVRQP I LQYRWDMMHRLGE PQARMREENME 
RIGEEVRQLMEKLREKQLSHSLRAVSTDPPHHDHHDE 
FCLMP 


2109 


A 


70 


993 


SEQKIQEQGYVWITVFSALPTTVSALHPRVLKPLSSL 
I HLQANSNPWECNCKLLGLRDVJfliAS SAI TLNI YWQNP 
PSMRGRALRYINITNCVTSSINVSRAWAWKSPHIHH 
KTTALMMAWHKVTTNGSPLENTETENITFWERI PTSP 
AGRFFQENAFGNPLETTAVLPVQ I QLTTS VTLNLE KN 
SALPNDAASMSGKTSLICTQEVEKLNEAFDILLAFFI 
LACVLI I FLI YKWQFKQKLKASENSRENRLE Y YS FY 
QSARYNVTASICNTSPNSLESPGLEQIRLHKQIVPEN 
EAQVILFEHSAL 


2110 


C 


160 


297 


MILCHLMQAPYHLKVSWEPTDPPTLWKCWTNVSTNPP 
LSALRGHR 


2111 


A 


2 


951 


PRVRPRVRPRVRS SRPRSRDPS PRRARLRWQLRWKPR 
WCPRPPKTPGVWKRPRTRPRS S AGGSTGFPSS PI LRR 
SPSTRRRS SRKAS PTATRATGTPPRQAQRKTARAAGR 
RRASPGIATAGTRSMISM\RPGRKPSNPSWEGRTNEE 
TSSLSRLKPVSPGTITCPLRTPGSLLKDSKIPISIKH 

RGLAGTTIRATACHDSAQKVVRSSRPRWMGPMPRNTT 
FPWETTKVSFAFPKESLL/WTPPVPRPAPERGPRRSL 
CPE *GPDNTRKRDATRGFLLSR 


2112 


A 


82 


435 


MLVLLPRSKAMPLLSVNVTLAFFPRNKEIVKYLLNQG 
ADVTLRAKNGYTAFDLVMLLNDPDIFGGELIGFLSW 
TELVRLLASVFMQVNKDIGRRSHQLPLPHSKVPTALE 
HPSAAR* 


2113 


A 


83 


1138 


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTE 
EVI EYFQKKVS PVHLKI LLTSDEAWKRF VRVAELPRE 
EADALYEALKNLTPWAIEDKDMQQKEQQFREWFLKE 
F PQIRWKIQES I ERLRVT ANE I EKVHRGCVI ANWSG 
STGILSVIGVMLAPFTAGLSLSITAAGVGLGIASATA 
G I AS S I VENTYTRS AELTASRLTATS TDQLEALRD I L 
HD I TPNVLS F ALD FDE ATKMI AND VHTLRRS KATVGR 
PLIAWRYVPINWETLRTRGAPTRIVRKVARNLGKAT 
SGVLVVLDVVNLVQDSLDLHKGEKSESAELLRQWAQE 
LEENLNELTHIHQSLKAG 


2114 


A 


83 


1138 


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTE
EVI EYFQKKVS PVHLKI LLTSDEAWKRFVRVAELPRE 
EADALYEALKNLTP.YVAI EDKDMQQKEQQFRE WFLKE 
F PQ I RWKIQES I ERLRVI ANE I E KVHRGCVI ANWSG 

STGILSVIGVMLAPFTAGLSLSITAAGVGLGIASATA

GIASSIVENTYTRSAELTASRLTATSTDQLEALRDIL
HD I T PNVLS FALD FDEATKM I AND VHTLRRS KATVGR 
PLIAWRYVPINWETLRTRGAPTRIVRKVARNLGKAT 

SGVLVVLDVVNLVQDSLDLHKGEKSESAELLRQWAQE

LEENLNELTHIHQSLKAG 


2115 


A 


700 


283 


VPRLVSPLSNPAPKFYCVSFFYHMYGKHIGSLNLLVR 
SRNKGALDTHAWSLS GNKGNWQQAHVPI S PSG P FQ I 
IFEGVRGPGYLGDIAIDDVTLKKGECPRKQTDPNKVV
VMPGSGAPCQSS PQLWGPMAI FLLALQR 


2116 


A 


700 


283 


VPRLVS PLSNPAPKF YCVS FF YHMYGKHIGSLNLLVR 
SRNKGALDTHAWSLSGNKGNVWQQAHVPI SPSGPFQI 
I FEGVRGPGYLGDIAIDDVTLKKGECPRKQTDPNKW 
VMPGSGAPCQSS PQLWGPMAI FLLALQR


2117 


A 


554 


970 


MVLPFICNLLPJRJ1PACRVLVHRPHGPELDADPYDPGE 
EDPAQSRALESSLWELQALQRHYHPEVSKAASVINQA 
LSMPE VS I APLLELT AYE I FERDLKKKGPE PVPTGVL 
S 0 PRACWDGR VKLCAQH FHAQLTLAHL * 


2118 


A 


1 


541 


VHVCSSKMGALSTERLQYYTQELGVRERSGHSVSLID 

LWGLLVE YLLYQEEN PAKLSDQQEAVRQGQNPYPI YT 
SVNVRTNLSGEDFAEWCEFTPYEVGFPKYGAYVPTEL 
FGSELFMGRLLQLQ PE PRI C YLQGMWGS AF ATS LDE I 
FLKTAGSGLSFLEWYRGS VNI TDDCQKPQLHN 


2119 


A 


1 


541 


VHVCSSKMGALSTERLQYYTQELGVRERSGHSVSLID 
LWGLLVE YLLYQEENPAKLSDQQEAVRQGQNPYP I YT 
SVNVRTNLSGEDFAEWCEFTPYEVGFPKYGAYVPTEL 
FGS ELFMGRLLQLQPEPRI CYLQGMWGS AFATSLDEI 
FLKTAGSGLS FLEWYRGS VNI TDDCQKPQLHN 


2120 


A 


X 


-L -J c* *± 


PHPSGPRITHSHARETACOP/GSEOHPGPHGGQLPRG 
GRQGPELPSHVCRAQA\GRTGQEPS SERPHAGQGAGL 
WSGS PWGRGRTQPTHAPTEGATPRC PLRPS PRGSGRA 
GPTLIRAGLSGGRGGRSLCPCGFPRAGAVPARS SHNQ 
TS PVHEKSRH/ GPTASGPGCWWLGDPQGRRVPGLAVP 
*APAAGTPMDKLPGLHLPEQRLPSIGGPFSAGLSPSG 
QSREWQGGSQGSRSRQFSKKAPGPPPS\TGGGCLGCG 
GRGT\ RGS AHAG\ PWGS PHQQGS * GAPGSQAKGGTP * 
RKPAPANGSSEEQEEARGPQGLEVSSSQTSASHAGLG 
LQGNSTRGVGPGPRPPAEPTTGRS WARSRVNPD* EQA 
SGA*VRSGSRSPGDALESSCNAPAWLQLCSAPCALGS 
REPGQGLAVTQTLCGPQSLGHPRESHKTRPRYEAATS 
SACLGLALTGTFSVEETEMFMTRQRPTGRDLQRGTRP 
QGWQGPVPGTSHYGRARPALGEASDKQEANGA 


2121 


A 


233 


692 


DNHPSFPRLPSSRPGTKEVLKEIHISDTTADVIFYPI 
YRMS EMI FRRI KMPWLWLDLWYLMFKEGWEHKKS LKI 
LHTFTNSVIAERANEMNANEDCRGDGRGSAPSKNKRR 
AFLDLLLSVTDDEGNRLSHEDIREEVDTFMFEVLYIV 
RFRYH 


2122 


A 


2 


1115 


PRVRSSGGQEDPASQQWARPRFTQPSKMRRRVIARPV 

GSSVKLKUVAbvjHFKirDl J. WW ISJJXJyALi ± KFoAftJi irKJV 

KKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDV 
IQRTRSKPVLTGTHPWTTVDFGGTTSFQCKVRSDVK 
PVIQWLKRVE YGAEGRHNSTI DVGGQKFWLPTGDVW 
SRPDGSYLNKLLITRARQDDAGMYICLGANTMGYSFR 
SAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAG 
AVFILGTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTA 
RDRSGDKDLPSLAALSAGPGVGLCEEHGSPAAPQHLL 
GPGPVAGPKLYPKLYTGHSTPHTYTHPPPSCQLNSSH 
S 


2123 


A 


2 


1115 


PRVRSSGGQEDPASQQWARPRFTQPSKMRRRVIARPV 

KKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDV 
IQRTRSKPVLTGTHPWTTVDFGGTTSFQCKVRSDVK 
PVIQWLKRVE YGAEGRHNSTI DVGGQKFWLPTGDVW 
SRPDGS YLNKLLI TRARQDDAGMYI CLGANTMGYS FR 
SAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAG 
AVFILGTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTA 
RDRSGDKDLPSLAALS AGPGVGLCEEHGS PAAPQHLL 
GPGPVAGPKLYPKLYTGHSTPHTYTHPPPSCQLNSSH 
S 


2124 


A 


2 


1115 


PRVRSSGGQEDPASQQWARPRFTQPSKMRRRVIARPV 

bob VKJjA.Vrf V Aortic l^-rL/X JL W l*i rdJLJ<^±\±i X S\trCit\r\szi c i\.xv 

KKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDV 
IQRTRSKPVLTGTHPVNTTVDFGGTTS FQCKVRSDVK 
PVIQWLKRVE YGAEGRHNSTI DVGGQKFWLPTGDVW 
SRPDGS YLNKLLI TRARQDDAGMYI CLGANTMGYSFR 
SAFLTVLPDPKPPGPPVASSSSATSLPWPWIGI PAG 

»Tfi?TT ptt t t MT.Pnanvif DPTD2i PA PPT.PfTHT? PPCTFA 

RDRSGDKDLPSLAALS AGPGVGLCEEHGS PAAPQHLL 
GPGPVAGPKLYPKLYTGHSTPHTYTHPPPSCQLNSSH 
S 


2125 


A 


3 


644 


PNWKRNPSLF*KVFPFMKKW/QRGSLLPPKSLDYDR 
FSRN/DTPLGRVSIPLNKVDLTQMQTFWKDLKPCSDG 
SGSRGELLLSLC YNPS ANS 1 1 VNI I KARNLKAM\ DI G 
GTSDP\YVKVWL\MYK\DKRV\EKKKTVT\MKRNLNP 
\ I FNESFAFDI PTEKLRETTI I ITVMDKDKLSRNDVI 
GKIYLSWKSGPGEVKHWKDMIARPRQPVAQWHQLKA 


2126 


A 


193 


883 


IMPCAQRSWLANLSWAQLLNFGALCYGRQPQPGPVR 
FPDRRQEHF I KGLPE YHWGPVRVDASGHFLS YGLHY 
PITSSRRKRDLDGSEDWVYYRI FHEEKDLFFNLTVNQ 
GFLSNSY IMEKRYGNLSHVKMMAS S APLCHLSGTVLQ 
QGTRVGTAALSACHGLTGF FQLPHGDFFI E PVKKH PL 
VEGGYHPHI VYRRQKVPETKE PTCGLKGI VTHMSSWV 
EESVLFFW 


2127 


A 


87 


477 


I KS KLNQQVE VQE S E WRLTEAKGPTMGKE SGWD SGRA 
AVAAWGGWAVGT VLVALS AMGFTS VGI AAS S I AAK 
MMS TAAI ANGGGVAAGSLVAI LQS VGAAGLS VTSKVI 
GGFAGTALGAWLGS PPS S 


2128 


A 


1993 


1379 


SLHLSERADWQYSQRAG/ DAVEVFFSRTARDNRLGCM 
FVRCAPSSRYTLLFSHGNAVDLGQMCSFYIGLGSRIN 
CNI FS YDYSGYGVSSGKPS EKNLYADI DAAWQALRTR 
YGVSPENI ILYGQSIGTVPTVDLASRYECAAVILHSP 
LMSGLRVAF PDTRKTYCFDAFPS I DKI SKVTS PVLVI 
HGTEDEVI DFSHGLAMYERCPRAVE PLWVEGAGHNDI 
ELYAQYLERLKQFISHELPNS *RQSK 


2129 


A 


1993 


1379 


SLHLSERADWQYSQRAG/ DAVEVFFSRTARDNRLGCM 
FVRCAPSSRYTLLFSHGNAVDLGQMCSFYIGLGSRIN 
CNI F S YD YS GYGVS S GKP S E KNL YAD I DAAWQALRTR 
YGVSPENI I LYGQS IGTVPTVDLASRYECAAVI LHSP 
LMSGLRVAF PDTRKTYCFDAF PS I DKI SKVTS PVLVI 
HGTEDEVIDFSHGLAMYERCPRAVEPLWVEGAGHNDI 
ELYAQYLERLKQFISHELPNS *RQSK 


2130 


A 


3 


383 


PPGPKGDQGDEGKEGRPGIPGLPGLRGLPGERGTPGL 
PGPKGNDGKLGATGPMGMRGFKGDRGPKGEKGEKGDR 
AGDASGVEAPMMI RLVNGSGPHEGRVEVYHDRRWGTV 
CDDGWDKKDGDWCRM 


2131 


A 


3 


383 


PPGPKGDQGDEGKEGRPGI PGLPGLRGL PGERGT PGL 
PGPKGNDGKLGATGPMGMRGFKGDRGPKGEKGEKGDR 
AGDASGVEAPMMI RLVNGSGPHEGRVEVYHDRRWGTV 
CDDGWDKKDGDWCRM 


2132 


A 


1 


2789 


GIRTSSPKTEGKHEETVNKESDMKVPTVSLKVSESVI 
DVKTTMESISNTSTQSLTAETKDIALEPKEQKHEDRQ 
SNTPSPPVSTFSSGTSTTSDIEVLDHESVISESSASS 
RQETTDSKSSLHLMQTSFQLLSASACPEYNRLDDFQK 
LTESCCSSDAFERIDSFSVQSLDSRSVSEINSDDELS 
GKGYALVPI I VNSSTPKSKTVESAEGKSEEVNETLVI 
PTEEAEMEESGRSATPVNCEQPDILVSSTPINEGQTV 
LDKVAEQCEPAESQPEALSEKEDVCKTVEFLNEKLEK 
REAQLLSLSKEKALLEEAFDNLKDEMFRVKEESS S I S 

LATRLNSSETADLLKEKDEQI RGLMEEGEKLS KQQLH 
NSNI IKKLRAKDKENENMVAKLNKKVKELEEELQHLK 
QVLDGKEEVEKQHRENI KKLNSMVERQEKDLGRLQVD 
MDELEEKNRSIQAALDSAYKELTDLHKANAAKDSEAQ 
EAALSREMKAKEELSAALEKAQEEARQQQETLAIQVG 
DLRLALQRTEQAAARKEDYLRHEIGELQQRLQEAENR 
NQELSQSVSSTTRPLLRQIENLQATLGSQTSSWEKLE 
KNLSDRLGESQTLLAAAVERERAATEELLANKIQMSS 
MESQNSLLRQENSRFQAQLESEKNRLCKLEDENNRYQ 
VELENLKDEYVRTLEETRKEKTLLNSQLEMERMKVEQ 
ERKKAI FTQETI KEKERKPFS VS STPTMSRSS S I SGV 
DMAGLQTSFLSQDESHDHSFGPMPI S /AKWKQSL*CC 
KDGSRI KH\ I ENLQSQLKLREGEITHLQLEIGNLEKT 
RS IMAEELVKLTNQNDELEEKVKE I PKLRTQLRDLDQ 
RYNTILQMYGEKAEEAEELRLDLEDVKNMYKTQIDEL 
LRQSLS 


2133 


A 


1 


2234 


MAASSIRDERTRTYYLPWRAPYTCNFRPSSAAVGRL 
GGWGRAQKWNNSGKCRFWE VS E SLTLEDVAVEFTWEE 
WQLLGPAQKDLYRDVMLENYSNLVSVGYQASKPDALF 
KLEQGE PWTVENE I HSQI C PGMYALYRKKHNGYRVKY 
DSEFQASMVWGVS WNI S P I DEGLLY I YKRHKE FTTEV 
DKGCETNIQMKDDKI KKVDNHLQMHSQKQRCLKRVEQ 
CHKHNAFGNI IHQRKSDFPLRQNHDTFDLHGKI LKSN 
LS LVNQNKRYE I KNSVGVNGDGKS FLHAKHEQFHNEM 
NFPEGGNSVNTNSQFI KHQRTQNIDKPHVCTECGKAF 
LKKSRLI YHQRVHTGEKPHGCS I CGKAFSRKSGLTEH 
QRNHTGEKPYECTECDKAFRWKSQLNAHQKIHTGEKS 

FSKRSRLTEHQRTHTGEKPYECTECDKAFRWKSQLNA 
HQKAHTGEKSYICRDCGKGFIQKGNLIVHQRIHTGEK 
PYICNECGKGFIQKGNLLIHRRTHTGEKPYVCNECGK 
GFSQKTCLISHQRFHTGKTPFVCTECGKSCSHKSGLI 
NHQRI HTGEKP YTCSDCGKAFRDKS CLNRHRRTHTGE 
RPYGCSDCGKAFSHLSCIiVYHKGMLHAREKCVG/ CSQ 
t cxy^l j ,r r <3 * t.t t yt * shtg * rlc * hgdsadafcgs s 
DLIN*QCVPSREQSSHCEPACCQKFSIjSR**NCHGIK 
NHYECR 


2134 


A 


3 


713 


RLAF PCGRPDYWAIaARRT I GTGLERKALGLPGS SERP 
TSVSSYQGTRIRCSNPGGKMRPLTEEETRVMFEKIAK 
YIGENLQLLVDRPDGTYCFRLHNDRVYYVSEKIMKLA 
ANI SGDKLVSLGTCFGKFTKTHKFRLHVTALDYLAPY 
AKYKVWIKPGAEQSFLYGNHVLKSGLGRITENTSQYQ 
GVVVYSMADI PLGFGVAAKSTQDCRKVDPMAI VVFHQ 
AD I GE YVRHEETLT 


2135 


A 


1 


350 


EGGTGVRSLSFYQHIITVGTGHGSLLFYDIRAQKFLE 
ERAS S SLDSMPGPAGRKLKLACGRGWLNQDDVWVNYF 
GGMGEFPNALYTHCYNWPEMKLFVAGGPLPSGLHGNY 
AGLWS 


2136 


B 


238 


1323 


Alio V CjX VOCi Vis. Vi2» VoUilUN X x l\±J v» \jjrv-Xj o v Cixv.vjvj/^. v j. on 

EAERVKGQAMI ATGGVI TGLAALKRQDS ARSQQHVNL 
SPSPATQEKKPIRRRPRADWWRGKIRLYSPSGFFL 
ILGVLISIIGIAMAVLGYWPQKEHFIDAETTLSTNET 
QVIRNEGGVWRFFEQHLHSDKMKMLGPFTMGI GI FI 
FICANAILHENRDKETKIIHMRDIYSTVIDIHTLRIK 
EQRQMNGMYTGLMGETEVKQNGS S CASRLAANTI AS F 
SGFRSSFRMDSSVEEDELMLNESKSSGHLMPPLLSDS 
S VSVFGLYPPPSKTTDDKTSGSKKCETKS IVSSS I S A 
FTLPVI KLNNCVI DE PS I DNI TEDADNLKX 


2137 


A 


41 


1285 


VGEMTLIWRHLLRPLCLVTSAPRILEMHPFLSLGTSR 
TSVTKLSLHTKPRMPPCDFMPERYQVIFLVNSGSEAN 

y-i T T\urr M7V T> TV LI CMNTT FIT T Q T?I? n a Q P YTT tfiT »TNVG 

I YKMELPGGTGCQPTMC PDVFRGPWGGSHCRDS PVQT 
IRKCSCAPDCCQAKDQYIEQFKDTLSTSVAKSIAGFF 
AE P I QGVNG WQ Y PKGFLKEAFELVRARGGVC I ANE V 
QTGFGRLGSHFWGFQTHDVLPDI VTMAKGI GNGF PMA 
AVI TTPE I AKS LAKCLQHFNT FGGNPMACAIGS AVLE 
VI KEENLQENSQEVGT YMLLKFAKLRDE FE I VGDVRG 
KGLMIGIEMVQDKISCRPLPREEVNQIHEDCKHMGLL 
VGRGS I FSQTFRI APSMC I TKPEVDFAVEVFRS ALTQ 
HMERRAK 


2138 


A 


41 


1285 


VGEMTLIWRHLLRPLCLVTSAPRILEMHPFLSLGTSR 
TSVTKLSLHTKPRMPPCDFMPERYQVI FLVNSGSEAN 

I YKMELPGGTGCQPTMCPDVFRGPWGGSHCRDS PVQT 

IRKCSCAPDCCQAKDQYIEQFKDTLSTSVAKSIAGFF 

AE P IQGVNG WQYPKGFLKEAFELVRARGGVC I ANE V 

QTGFGRLGSHFWGFQTHDVLPDI VTMAKGI GNGF PMA 

AVI TTPE I AKS LAKCLQHFNT FGGNPMACAI GS AVLE 
VI KEENLQENSQEVGTYMLLKFAKLRDEFE I VGDVRG 
KGLMIGIEMVQDKISCRPLPREEVNQIHEDCKHMGLL 
VGRGS I FSQTFRI APSMC I TKPEVDFAVEVFRS ALTQ 
HMERRAK 


2139 


A 


3 


362 


t? n xr d a c n r\rnn,Y PANTTiRT? PWHVG I MNHGSHLCGGS I 
LNEWWVLS ASHCFDQLNNS KLEII HGTEDLSTKGI KY 
QKVDKLFLHPKFDDWLLDNDI ALLLLKS PLNLS VNRI 
PICTSEISD 


2140 


A 


1 


663 


EI ANLILAENCEAALALHLYRGGRLLQGHRI PFGVI F 
GGTDVNEDANQAEKNTVMGRVLEEARFAVAFTESMKE 
MAQAQWVDPVFTREVKAKVKRAAGVRLIGEMPQEDLH 
AWKNCFAWNSSVSEGMSAAILEAMDLEVPVLARNI 
PGNAAVVTCHEVTCLLFSNPQEFVHLAKRLVSDPALEK 
EIVVNGREYVRMYHSWQVERDTYQQLIRKLEGSTED 


2141 


A 


8 


1516 


MSLVLLSLAALCRSAVPREPTVQCGSETGPS PEWMLQ 
HDLI PGDLRDLRVE P VTTS VATGDYS I LMNVS WVLRA 
DAS IRLLKATKICVTGKSNFQSYSCVRCNYTEAFQTQ 
TRPSGGKWTFS YIGFPVELNTVYFIGAHNI PNANMNE 
DGPSMSVNFTSPCjCijUrixiyiivx js.js.IS.i-. vis^jjojjwi-'ri.N j. x 
ACIQCNEETVE VNFTTT PLGNR YMAL IQHSTIIGF SQ V 
FE PHQKKQTRAS WI PVTGDS EGATVQLTP YF PTCGS 
DC I RHKGTWLC PQTGVPF PLDNNKSKPGGWLPLLIjL 
SLLVATWVLVAGI YLMWRHERI KKTS FSTTTLLPPI K 
VLWYPSEICFHHTICYFTEFLQNHCRSEVILEKWQK 
KKIAEMGPVQWLATQKIs^UUDKWFLLSNDWSVCDGT 
CGKSEGSPSENSQDLFPLAFNLFCSDLRSQIHLHKYV 
WYFRE I DTKDDYNALS VC PKYHLMKDATAFCAELLH 
VKQQVSAGKRSQACHDGCCSL * 


2142 


A 


1 


622 


PDPCLNGGSCVDLVGNYTCLCAE PFKGLRCETGDHPV 
PDACLSAPCHNGGTCVDADQGYVCEYPEGFMGLDCRE 
RVTDDCECRNGGRCLGANTTLCQCPLGFFGLLCEFEI 
TAMPCNMNTQC PDGG YCMEHGGS YLCQC PLGF FGLLC 
EFEITAMPCNMNTQCPDGGYCMEHGGSYLCVCHTDHN 
ASHSLPSPCDSDPSSCQAWRNE 


2143 


A 


3 


87 


PAMNVNDSVTKQKFDNLYCCRESILDG 


2144 


A 


406 


888 


IASQNFDPATVSVATAHKGAEPSRGTAWGPVAKRLQQ 
ELMTLMMPGDKRISAYPESLIKWTPSMKQLAQCMKI * 
GISSCWSSSMTTHLYDAPTVKFLTPCYHPNVDTQGNI 
CLDILKEKWSAPYDIRTILLSIQCLLGQLNIDSPLNT 
HATKLWENPIALR 


2145 


A 


46 


1576 


APYLPDPMKHTLALLAPLLGLGLGLALSQLAAGATDC 
KFLGPAEHLTFTPAARARWLAPRVRAPGLLDSLYGTV 
RRFLSWQLNPFPSELVKALLNELASVKVNEVVRYEA 
GYVVCAVIAGLYLLLVPTAGLCFCCCRCHRRCGGRVK 
TEHKALACERAALMVFLLLTTLLLLIGVVCAFVTNQR 
THEQMGPS I EAMPETLLSLWGLVSDVPQ / GVGVS I GS 
AI HTQLRS S V\ TPCLAAVGSLGQVLQVSVHHLQTLNA 
TWELQAGQQDLE PAI REHRDRLLELLQE/ SQVPS VD 
HVLHQLKGVPEANFSSMVQEENSTFNALPALAAMQTS 
SWQELKKAVAQQPEGVRTLAEGFPGLEAASRWAQAL 
QEVEESSRPYLQEVQRYETYRWIVGCVLCSVVLFVVL 
CNLLGLNLGI WGLS ARDD PSH PE AKGEAGARFLMAGV 
GLSFLFAAPLILLVFATFLVGGNVQTLVCQSWENGEL 
FEFADTPGNLPPSMNLSQLLGLRKNI SI HQ AY 


2146 


A 


3 


717 


DLKDTIGSVTKTPSGLYIIHPEGSSYPFEVMCDMDYR 
GGGWTVIQKRIDGI IDFQRLWCDYLDGFGDLLGEFWL 
GLKKI FYIVNQKNTSFMLYVALESEDDTLAYASYDNF 
WLEDETRFFKMHLGRYSGNAGDAFRGLKKEDNQNAMP 
FSTSDVDNDGCRPACLVNGQSVKSCSHLHNKTGWWFN 
ECGLANLNGI HHFSGKLLATGIQWGTWTKNNS PVKI K 
SVSMKIRRMYNPYFK 


2147 


A 


3 


717 


DLKDTIGSVTKTPSGLYIIHPEGSSYPFEVMCDMDYR 
GGGWTVIQKRIDGIIDFQRLWCDYLDGFGDLLGEFWL 
GLKKI F YI WQKNTSFMLYVALESEDDTLAYAS YDNF 
WLEDETRFFKMHLGRYSGNAGDAFRGLKKEDNQNAMP 
FSTSDVDNDGCRPACLWGQSVKSCSHLHNKTGWWFN 
ECGLANLNGIHHFSGKLLATGIQWGTWTKNNS PVKI K 
SVSMKIRRMYNPYFK 


2148 


A 


3 


717 


DLKDTIGSVTKTPSGLYI IHPEGSSYPFEVMCDMDYR 
GGGWTVIQKRIDGI I DFQRLWCDYLDGFGDLLGEFWL 
GLKKI FYI VNQKNTS FMLYVALESEDDTLAYAS YDNF 

FSTSDVDNDGCRPACLVNGQSVKSCSHLHNKTGWWFN 
ECGLANLNGIHHFSGKLLATGIQWGTWTKNNS PVKI K 
SVSMKIRRMYNPYFK 


2149 


A 


1397 


1565 


DRLESLLEMHI PGVYPNQWNTNFYLFI YFEAESHS VA 
QTGLQ* RHLGSLQLPPPQV 


2150 


A 


836 


633 


MSRNLRTALIFGGFISLIGAAFYPIYFRPLMRLEEYK 
KEQAINRAGIVQEDVQPPGLKVWSDPFGRK* 


2151 


A 


294 


1568 


MSLTI WTVCGVLSLFGALS YAELGTTI KKSGGHYTYI \ 
LEVFGPLPAFVRVWVELLI I RPAATAVI SLAFGRYIL 
EPFFIQCEI PELAIKLI TAVGITVVMVLNSMSVSWSA 
RIQIFLTFCKLTAILI I I VPGVMQLIKGQTQNFKDAF 
SGRDS S ITRLPLAFY x GMYAYAGWF YLN rVTfcjEVbJN F 
EKTIPLAICISMAIVTIGYVLTNVAYFTTINAEELLL 
SNAVAVTFS ERLLGNFS LAVPI FVALSCFGSMNGGVF 
AVSRLFYVASREGHLPEILSMIHVRKHTPLPAVIVLH 
PLTMIMLFSGDLDSLLNFLSFARWLFIGLAVAGLIYL 
RYKCPDMHRPFKVPLFIPALFSFTCLFMVALSLYSDP 
FSTGIGFVITLTGVPAYYLFIIWDKKPRWFRIMSEKI S 
TRTLQIILEWPEEDKL* 


2152 


A 


217 


378 


KNLFYSLSLICSSYPSILDHIVHIIELIGRIPRRFSL 
SGKYSQDFFSHRGSIVM 


2153 


A 


2046 


4541 


MTLALAYLLAL PQVLDANRCFEKQS PSALSLQLAAYY 
YSLQIYARLAPCFRDKCHPLYRADPKELIKMVTRHVT 
RHEHEAWPEDLI SLTKQLHC YNERLLDFTQAQI LQGL 
RKGVDVQRFTADDQYKRETILGLAETLEESVYS I AI S 
LAQRYSVSRWEVFMTHLEFLFTDSGLSTLEIENRAQD 
LHLFETLKTDPEAFHQHMVKYIYPTIGGFDHERLQYY 
FTLLENCGCADLGNCAI KPETHIRLLKKFKWASGLN 
YKKLTDENMS PLEALE PVLS SQN I LS I S KL VPKI PEK 
DGQMLSPSSLYTIWLQKLFWTGDPHLIKQVPGSSPEW 
LHAYDVCMKYFDRLHPGDL ITWDAVTFS PKAVTKLS 
VEARKEMTRKAIKTVKHFIEKPRKRNSEDEAQEAKDS 
KVTYADTLNHLEKSLAHLETLSHSFILSLKNSEQETL 
QKYSHLYDLSRSEKEKLHDEAVAICLDGQPLAMIQQL 
LEVAVGPLDISPKDIVQSAIMKI I SALSGGSADLGGP 
RDPLKVLEGWAAVHAS VDKGEELVS PEDLLEWLRPF 
CADDAWPVRPRIHVLQILGQSFHLTEEDSKLLVFFRT 
EAILKASWPQRQVDIADIENEENRYCLFMELLESSHH 
EAE FQHLVLLLQAWPPMKS E YVI TNNPWVRLATVMLT 
RCTMENKEGLGNEVLKMCRSLYNTKQMLPAEGVKELC 
LLLLNQSLLLPSLKLLLESRDEHLHEMALEQITAVTT 
VNDSNCDQELLSLLLDAKLLVKCVSTPFYPRIVDHLL 
ASLQQGRWDAEELGRHLREAGHEAEAGSLLLAVRGTH 
QAFRTFSTALRAAQHWV* 


2154 


A 


2046 


4541 


MTLALAYLLALPQVLDANRCFEKQS PSALSLQLAAYY 
YSLQI YARLAPCFRDKCHPLYRADPKELI KMVTRHVT 
RHEHEAWPEDLI SLTKQLHC YNERLLDFTQAQI LQGL 
RKGVDVQRFTADDQYKRETILGLAETLEESVYSIAIS 
LAQRYSVSRWEVFMTHLEFLFTDSGLSTLEIENRAQD 
LHLFETLKTDPEAFHQHMVKYIYPTIGGFDHERLQYY 
FTLLENCGCADLGNCAI KPETHIRLLKKFKWASGLN 
YKKLTDENMS PLEALEPVLS SQNI LS I SKLVPKI PEK 
DGQMLSPSSLYTIWLQKLFWTGDPHLIKQVPGSSPEW 
LHAYDVCMKYFDRLHPGDLI TWDAVTFS PKAVTKLS 
VEARKEMTRKAI KTVKHFIEKPRKRNSEDEAQEAKDS 
KVTYADTLNHLEKSLAHLETLSHSFILSLKNSEQETL 
QKYSHLYDLSRSEKEKLHDEAVAI CLDGQPLAMIQQL 
LEVAVGPLDISPKDIVQSAIMKI I SALSGGSADLGGP 
RDPLKVLEGWAAVHAS VDKGEELVS PEDLLEWLRPF 
CADDAWPVRPRIHVLQILGQSFHLTEEDSKLLVFFRT 
EAI LKAS WPQRQVDI ADI ENEENR YCLFMELLE S S HH 
EAE FQHLVLLLQAWPPMKS E YVITNNPWVRLATVMLT 
RCTMENKEGLGNEVLKMCRSLYNTKQMLPAEGVKELC 
LLLLNQSLLLPSLKLLLESRDEHLHEMALEQITAVTT 
VNDSNCDQELLSLLLDAKLLVKCVSTPFYPRIVDHLL 
ASLQQGRWDAEELGRHLREAGHEAEAGSLLLAVRGTH 
QAFRTFSTALRAAQHWV* 


2155 


A 


2 


362 


QELERSMAQRCVCVLALVAMLLLVF PTVSRSMGPRSG 
EHQRASRI PSQFSKEERVAMKEALKVFPTWSTSFIQ 
HEWEEYSHLFTIQGSDPSLQPYLLMAHFDWPAPEE 
GWEVPPFSG 


2156 


A 


940 




GELVGVGGHFLFLGLALVSKDWRFLQRMI TAPCI LFL 
FYGWPGLFLESARWLIVKRQIEEAQSVLRILAERNRP 
HHnMT .nRPAnPATiOnTjENTCPTjPATS S FS FASLLNYR 
NIWKNLLILGFTNFIAHAIRHCYQPVGGGGSPSDFYL 
CSLLASGTAALACVFL»GVTVDRFGRRGI LLLSMTLTG 
IASLVLLGLWDYLNEAAITTFSVLGLFSSQAAAILST 
LLAAEVI PTTVRGRGLGLIMALGALGGLSGPAQRLHM 
GHGAFLQHWLAACALLCILS IMLLPETKRKLLPEVL 
RDGELCRRPSLLRQPPPTRCDHVPLLATPNPAL* 


2157 


A 


317 


3 


MYALLGVFCLAI LVFLINCAT FALKYRHKQVPLEGQA 
SMTHSHDWVWLGNEAELLESMGDAPPPQDEHTTI IDR 
GPGACEESNHLLLNGGSHKHVQSQIHRSADS 


2158 


A 


J 




T.TjR AR SPi^SERAGVGC^YMLSKGWWKEGRHGGHRRP 
RGWGAAGRRQSVPGGPAAP/ PCTLYSVGADGRGQGHQ 
SRGCRPPGPPSASSAPCLAWGAAGRARREG/ RSGRCR 
TEFSPGCTRR*ALT\CGAGPCRR*SR*RGTRRCLRPW 
ASPGTGAACGRCCCPPP* PHLFWLPPSLRLPAEMLLA 
GSRPT PACRS S PGGS VHTTTGS PAS RRGSRCRGRSRP 
S PRPRPSVLS CHGVSL * TGRGRRRGC PRARGRRA/GV 
APPSCRKSAR\CGGRPALRRAGPPSCALGPGAPPPHI 
WAPETAE PAPAVPC PERPGC PAPAAAPRPLS PDPAQL 
PALARLRPS PGFGERAHAQPA 


2159 


A 


190 


2392 


VPGEECDGITSMS AE SGPGTRLRNLPVMGDGLETSQM 
STTQAQAQPQEANAASTNPPPPETSNPNKPKRQTNQL 
QYLLRVVLKTLWKHQFAWPFQQPVDAVKLNLPDYYKI 
IKTPMDMGTIKKRLENNYYWNAQECIQDFNTMFTNCY 
IYNKPGDDIVLMAEALEKLFLQKINELPTEETEIMIV 
QAKGRGRGRKETGTAKPGVSTVPNTTQASTP PQTQT P 
QPNPPPVQATPHPFPAVTPDLIVQTPVMTWPPQPLQ 
TPPPVPPQPQPPPAPAPQPVQSHPPIIAATPQPVKTK 
KGVKPJCADTTTPTTIDPIHEPPSLPPEPKTTKLGQRR 
ESSRPVKPPKKDVPDSQQHPAPEKSSKVSEQLKCCSG 
I LKEMFAKKHAAYAWPFYKP VDVEALGLHDYCDI I KH 
PMDMSTIKSKLEAREYRDAQEFGADVRLMFSNCYKYN 
P PDHE WAMARKLQDVFEMRF AKMPDE PEE P WAVS S 
PAVPPPTKWAPPSSSDSSSDSSSDSDSSTDDSEEER 
AQRLAELQEQLKAVHEQLAALSQPQQNKPKKKEICDKK 
EKKKEKHKRKEEVEENKKSKAKEPPPKKTKKNNSSNS 
NVS KKE PAPMKS KP PPTYE S E EEDKCKPMS YE E KRQL 
SLDINKLPGEKLGRWHIIQSREPSLKNSNPDEIEID 
FETLKPSTLRELERYVTSCLRKKRKPQ/ASEKVDVIA 
GSSKMKGFSSSESESSSESSSSDSEDSETGPA 


2160 


A 


108 


440 


MQATSNLLNLLLLSLFAGLNPSKTHINPKEGWQVYS S 
AQDPDGRGICTWAPEQNLCSRDAKSRQLRQLLEKVQ 
NMSQSIEVLNLRTQRDFQYVLKMETQMKGLKAKFRQI 


2161 


A 


18 


467 


REELGKDLFDCTLYVLLKYDDFNADKHLALEE F YRAF 
QVIQLSLPEDQKLSITAATVGQSAVLSCAIQGTLRPP 
IIWKRNNIILNNLDLEDINDFGDDGSLYITKVTTTHV 
GNYTCYADGYEQVYQTHI FQVNVPPVI RVYPESQARR 
AG 


2162 


A 


79 


415 


MFYQMIWTNGPAKLPASSTKHDLYLCNSFTGPSNI I W 
NLGSRYI FTVI KHGIiGFFLNTILAVLNI AGRNLKCYK 
FC * TGWKLGWS I GPNHL I KHLQT VQQNT IYIRRPSKG 
VAQVRTRGS 


2163 


A 


59 


447 


I TVDRNTETRTS S FS 1 1 S VPAS ST * GS PSRVI YAKLG 
GEILDYRDLAALPKSKAIYDIDRPDMISYSPYISHSA 
GDRQSYGESPQLLSPTPTEGDQDDRSYKQCRTSSPSS 
TGLVSLGRYTPTSRAPQH 


2164 


A 


3 


493 


DPRVRFTVCGTPTYVAPEILSEKGYGLEVDMWAAGVI 
LYILLCGFPPFRSPERDQDELFNIIQLGHFEFLPPYW 
DNI SDAAKDLVSRLLWD PKKRYTAHQVLQHPW I ETA 
G/EDQYSETTEAGVPQQRGSLPEPAQEGCGAGI IVTT 
LGI CPAPSSAQGQRKG 


2165 


A 


3 


493 


D PRVRFTVCGTPTYVAPE I LS EKGYGLE VDMWAAGVI 
LYILLCGFPPFRSPERDQDELFNIIQLGHFEFLPPYW 
DNI SDAAKDLVSRLLWDPKKRYTAHQVLQHPWI ETA 
G/ EDQYSETTEAGVPQQRGSLPEPAQEGCGAGI IVTT 
LGI CPAPSSAQGQRKG 


2166 


A 


1334 


470 


SAAQLSLCSRLQLTLYQYTTCPFCSGVRAFLDFHALP 
YQWEVNPERRAEIKFSSYRKVPILVAQEGESSQQLN 
DSSVI ISALKTYLVSGQPLEEI ITYYPAMKAVNEQGK 
EVTEFGNKYWLMLNEKEAQQVYGGKEARTEEMKWRQW 
ADDWLVHLI S PNVYRT PTEALAS FDYI VREGKFGAVE 
GAVAKYMGAAAMYLI SKRLKS RHRLQDNVREDL YEAA 
DKWVAAVGKDRPFMGGQKPNLADLAVYGVLRVMEGLD 
AFDDLMQHTHIQPWYLRVERAITEASPAH 


2167 


A 


996 


214 


GRIRMQRQSTTGGRGIMEGPRGWLVLCVLAI SLASMV 
TEDLCRAPDGKKGEAGRPGRRGRPGLKGEQGEPGAPG 
IRTGIQGLKGDQGEPGPSGNPGKVGYPGPSGPLGARG 
IPGIKGTKGSPGNIKDQPRPAFSAIRRNPPMGGNWI 
FDTVI TNQEE PYQNHSGRF VCTVPGYYYFTFQVLSQW 
EICLSIVSSSRGQVRRSLGFCDTTNKGLFQWSGGMV 
LQLQQGDQVWVEKDPKKGHIYQGSEADSVFSGFLIFP 
SA 


2168 


A 


3 


420 


LRRFSTDCSSDQQDRLNGTAPSGFNRS*PVPLPHPIL 
E VC PGQ * E PQS AI S LTAFQ VQ AGASRAS PG P P APS S S 
KPGRKAKVASPCPDRPAPPPT*PRPAAAPGSESSPRP 
PRPRTGRRQQRAHARRAAARTAPWRPSC 


2169 


A 


2744 


496 


ENEEQDSQNEGSTDEKSS PASSQEGS PSGDQQFS PKS 
NTEKSKGELMFDDSSDSSPEKQERNLNWTPAEVPQLA 
AAKRRLPQGKEPGLINLCANVPPVPGNI LPPEVRGNL 
MAAGQNLQS SERS EMI ATWS PAVRTLRNITNNADI QQ 
MNRPSNVAHI LQTLSAPTKNLEQQVNHSQQGHTNANA 
VLFSQVKVTPETHMLQQQQQAQQQQQQHPVLHLQPQQ 
IMQLQQQQQQQISQQPYPQQPPHPFSQQQQQQQQPPP 
SPQQHQLFGHDPAVEIPEEGFLLGCVFAIADYPEQMS 
DKQLLATWKRI IQAHGGTVDP \ PS RVD ARTFS VR VKS 
AAR/ IAQAIRERKRCVTAHWLNTVLKKKKMVPPHRAL 
HFPVAFPPGGKPCSQHIISVTGFVDSDRDDLKLMAYL 
AGAKYTGYLCRSNTVLICKEPTGLKYEKAKEWRI PCV 
NAQWLGDILLGNFEALRQIQYSRYTAFSLQDPFAPTQ 
HLVLNLLDAWRVPLKVSAELLMS IRLPPKLKQNEVAN 
VQP\ SSKRARIED\ VPPPTKKLTP\ ELTPF\ VLFTGF 
E PVQVQQYI \ KKLYI LGGE VAE SAQKCTHL IAS KVTR 
TVKFLA\ AI SWKHI VTPEWLEECFRCQKFIDEQNYI 
LRDAEAEVLFSFSLEESLKRAHVSPLFKAKYFYITPG 
\ICPSLSTMKAIVECAGGKVLSK\QPSFRKLMGAQAG 
TSSLFGK* F*LSC\ENDLHFIR\E YFARG\ IDVHNAE 
F\VLTEVLTQTLDYESYKV 


2170 


A 


2744 


496 


ENEEQDSQNEGSTDEKSS PAS SQEGS PSGDQQFS PKS 
NTEKSKGELMFDDSSDSSPEKQERNLNWTPAEVPQLA 
AAKRRLPQGKEPGLINLCANVPPVPGNI LPPEVRGNL 
MAAGQNLQS SERSEMI ATWS PAVRTLRNITNNADIQQ 
MNRPSNVAHILQTLSAPTKNLEQQVNHSQQGHTNANA 
VLFSQVKVTPETHMLQQQQQAQQQQQQHPVLHLQPQQ 
IMQLQQQQQQQI SQQP YPQQP PHPFSQQQQQQQQPPP 
SPQQHQLFGHDPAVEI PEEGFLLGCVFAIADYPEQMS 
DKQLLATWKRI I QAHGGTVDP \ PSRVDARTFS VRVKS 
AAR/ IAQAI RERKRC WAHWLNTVLKKKKMVPPHRAL 
HFPVAFPPGGKPCSQHI I SVTGFVDSDRDDLKLMAYL 
AGAKYTGYLCRSNTVLI CKE PTGLKYEKAKEWR I PCV 
NAQWLGDI LLGNFEALRQI QYSRYTAFS LQDPFAPTQ 
HLVLNLLDAWRVPLKVSAELLMS IRLPPKLKQNEVAN 
VQP\ SSKRARIED\VPPPTKKLTP\ELTPF\ VLFTGF 
E PVQVQQYI \ KKLYI LGGEVAE SAQKCTHLI AS KVTR 
TVKFLA\AI SWKHI VTPEWLEECFRCQKFIDEQNYI 
LRDAEAEVLFSFSLEESLKRAHVSPLFKAKYFYITPG 
\ICPSLSTMKAIVECAGGKVLSK\QPSFRKLMGAQAG 
TS S LFGK* F * LS C \ ENDLHF I R \ E YFARG\ I DVHNAE 
F\VLTEVLTQTLDYESYKV 


2171 


A 


3 


581 


GRRLRSEPRPARPPIARAWPPAPGADGRARRTRVPAP 
CLPRAPCYGVRPRAWRPRPARLRGGLVRWLLSGGPQP 
RRPRATERPS AGTGAAPRRTE PRGRCRGCGRGRG * GP 
RAWGLALCS PHSCSGAAWGPTTGSQRSWPAVARSWQG 
DS SRC PALRTTTVTAGS KAAL PES AAE VS PMS S S PGR 
KRSGFAA 


2172 


A 


70 


993 


SEQKIQEQGYVWITVFSALPTTVSALHPRVLKPLSSL 
I HLQANSNP WE CNCKLLGLRDWLAS S AI TIiN I YWQNP 
PSMRGRALRYINI TNCVTS S INVSRAWAWKSPH IHH 
KTTALMMAWHKVTTNGS PLENTETEN I TF WERI PTS P 
AGRFFQENAFGNPLETTAVLPVQIQLTTSVTLNLEKN 
SALPNDAASMSGKTSLICTQEV^KLNEAFDILLAFFI 

T 7\ /-it 7T t T T7T TW\n/nT?VAlfT VIOTTMCDPMDT T7VVCT7V 

QSARYNVTASICNTSPNSLESPGLEQIRLHKQIVPEN 
EAQVI LFEHSAL 


2173 


A 


2 


722 


AVRLNI S YP PQNLTMT VFQGDGT AS TTLRNGS ALS VL 
EGQSLHLVCAVDSNPPARLSWTWGSLTLSPSQSSNLG 
VLELPRVHVKDEGEFTCRAQNPLGSQHISLSLSLQNE 
YTGKMRPI SGVMLGAFGGAGATALVFLSFCI I FVWR 
SCRKXSARPAVGVGDTGMEDANAVRGSASQGPLIESP 

Tk nn G ID 13 tXU A D D A T . A T T> Q T3T?T?r!T7 T OV A Q T . Q T? WTf A P PHY 

PQEQEAI G YE YS E IN I PK 


2174 


A 


2043 


1232 


SHIQHHGRGAQAPVKMVS WMI SRAWLVFGMLYPAYY 
S YKAVKTKNVKE YVRWMMYWI VFAL YTVI ETVADQT V 
AWFPLYYELKIAFVIWLLSPYTKGASLIYRKFLHPLL 
SSKEREIDDYIVQAKERGYETMVNFGRQGLNLAATAA 
VTAAVKSQGAI TERLRS F SMHDLTTI QGDE P VGQR P Y 
QPLPEAKKKSKPAPSESAGYGI PLKDGDEKTDEEAEG 

YKVKKRPQVYF 


2175 


A 


1 


790 


RGYNPNVNAGI INSFATAAFRFGHTLINPILYRLNAT 
LGEI SEGHLPFHKALFS PSRII KEGGI DPVLRGLFGV 
AAKWRAPSYLLSPELTQRLFSAAYSAAVDSAATI IQR 
GRDHGIPPYVDFRVFCNLTSVKNFEDLQNEIKDSEIR 
QKLRKLYGS PGD I DLWPALMVEDLI PGTRVGPTLMC / 
ML/STQFQRLRDGDRFWYENPGVFTPAQLTQLKQASL 
SRVLCDNGDSIQQVQADVF/RKRQEYPQDYLNCKRES 
PNVDPAKC 


2176 


A 


1 


790 


RGYNPNVNAGI INS FATAAFRFGHTLINPILYRLNAT 
LGEI SEGHLPFHKALFS PSRI I KEGGI DPVLRGLFGV 
AAKWRAPS YLLS PELTQRLFS AAYS AAVDSAATI IQR 
GRDHGIPPYVDFRVFCNLTSVKNFEDLQNEIKDSEIR 
QKLRKLYGS PGD I DLWPALMVEDLI PGTRVGPTLMC / 
ML / STQFQRLRDGDRFWYENPGVFTPAQLTQLKQAS L 

PNVDPAKC 


2177 


A 


1 


790 


RGYNPNVNAGI INS FATAAFRFGHTLINPI LYRLNAT 
LGE I SEGHL PFHKALFS PSRI I KEGG I DPVLRGLFGV 
AAKWRAPS YLLS PELTQRLFS AAYSAAVDSAAT I IQR 
GRDHGIPPYVDFRVFCNLTSVKNFEDLQNEIKDSEIR 
QKLRKLYGS PGDI DLWPALMVEDLI PGTRVGPTLMC / 
ML/STQFQRLRDGDRFWYENPGVFTPAQLTQLKQASL 
SRVLCDNGDSIQQVQADVF/RKRQEYPQDYLNCKRES 
PNVDPAKC 


2178 


A 


501 


187 


AGVKWYEHGLWQP P P PGLKRS S HLSL PS S * DHRHE Y P 
CPANF*KIFF\VETRSHYVAQTSLEFLDSSNPPTSAS 
QNAGI \ *GMSHCAQPMQTFSLVKIGTNFLIF 


2179 


A 


4312 


2359 


AEKKMLPVDGEERKSEGSDTEGDRTSPCAVSSLIVSN 
RYPRGGPYII\ATLKDLEVGGSGRRCSDPAGQPSNIjL 
PQRGLGAPL PAETAHTQPS PNDRSLYLS PKS S S AS SS 
LHARQSPCQEQAAVLNSRS IKI SRLNDTI KSLKQQKK 
QVEHQLEEEKKANNEKQKAERELEGQIQRLNIEKKKL 
NTDLYHMKHSLRYFEEESKDLAGRLQRSSQRIGELEW 
SLCAVAATQKKKPDGFSSRSKALLKRQLEQSIREQIL 
LKGHVTQLKESLKEVQLERDQ YAEQI KGERAQWQQRM 
RKMSQEVCTLKEEKKHDTHRVEELERSLSRLKNQMAE 
PL P PDAPAVSS E VELQDLRKE LERVAGE LQAQVENNQ 
CISLLNRGQK\ERLREQEERLQEQQERLREREKRLQQ 
LAEPQSDLEELKHENKSALQLEQQVKELQEKLGQVME 
TLTSAEKEPEAAVPASGTGGESSGLMDLLEEKADIjRE 
HVEKLELGFIQYRRERCHQNVHRLLTEPGDSAKDASP 
GGGHHQAGPGQGGEEGEAAGAAGDGVAACGSYSEGHG 
KFLAAAQNPAAEPSPGAPAPQELGAADKHGDLCEASL 
TNS VE PAQGEAREGS SQDNPTA\ Q P I VQLLGEMQDHQ 
EHPGLGSNCCVPCFCWAWLPRRRR 


2180 


A 


2 


1273 


GGALQCGDPLARS PAVPAPRVPAQP PPGLGRRASRKE 
AATLAMASPPACPSEEDESLKGCELYVQLHGIQQVLK 
DCIVHLCISKPERPMKFLREHFEKLEKEENRQILARQ 
KSNSQSDSHDEE VS PTPPNP WKARRRRGGVS AE VYT 
EEDAVS YVRKVI PKDYKTMTALAKAI SKNVLFAHLDD 
NERSDI FDAMFPVTHI AGETVIQQGNEGDNF YWDQG 
EVDVYVNGEVAn^ISEGGSFGELiAJbl xvji FKAA1 VKA 
KTDLKLWGIDRDSYRRILMGSTLRKRKMYEEFLSKVS 
ILESLEKWERLTVADALEPVQFEDGEKIWQGEPGDD 
FYII TEGTAS VLQRRS PNEE YVEVGRLGPSD YFGE I A 
LLLNRPRAATWARGPLKCVKLDRPRFERVLGPCSEI 
LKRNIQRYNSFISLTV 


2181 


A 


1 


303 


PTRPLERGPSGLGMGLIDGMHTHLGAPGLYIQTLLPG 
S PAAADGRLSLGDRI LEVNGS SLLGLGYLRAVDLIRH 
GGKKMRFLVAKSDVETAKKIHFRTPPL 


2182 


A 


2227 


332 


MGKYTVRVATGDLLLAGS PNLVQLWLVGEHGEADLGK 
QL P PVWGKE AEFE I DVPLHLGRLLMVKLRKHNVLLSL 
DWFCKWISVQGPGTQGAAFFPCYRWVQGHGIICLPEG 
T/RWGSWKDGLILPIAGNRQPDLPRDERFLEDKDLDF 
NVSLAKGLKDLAI KGTLDFINCVKRLEDFKKI F PHGK 
TVLAERVYDSWKNDAFFGYQFLNGANPMLLRCSSRLP 
ACLVLPPGMEDLKTQLEKELQAGSLFEVDFSLLDGVK 
PNVT I FKQQCVAAPLWLKLQ PDGGLL PMVI QLQ P P * 

ur^rmrmT T t?t.DCUDDMRWT.T.A V'TIaHTT? QQnPOT/^nT/^Q 
HvjL. r* xr Jtjj Jj r J-i c Q ti f xrflft, W JjJjAJ\.J. w vrcooiJr yj-n^yi-iyo 

HLLRGHLIAEVIAVATMRSLPSLHPIYKLLIPHFRYT 
MAINTLAQS S LVSEWGI FDL WSTGSGSHVDI LQRAM 
ACLTYHSLCPPDDLADRGLLDVKSSFYG*DAIRLWGI 
I SRE * \ YVEGMVGLF YNSDQAMKDDLELQAWCREMTE 
TGLQRAQDQGFLI SLESRAQLCHFVTMCI FTCTGQHA 

SNHLGQLDWYSWI PNGPCTMQKP PPI SKDVTEKDI VD 
LLPNLHQARMQKTFTKFLGRRQPVMHEEKYFSGPE PQ 
AVLRQFQEELASMDKEIEVRNAVLNLPCEYL* PSMVE 
\NSVTI 


2183 


A 


2227 


332 


MGKYTVRVATGDLLLAGS PNL VQLWLVGEHGEADLGK 
QLPPWGKEAEFEIDVPLHLGRLLMVKLRKHNVLLSL 
DWFCKWI SVQGPGTQGAAFFPCYRWVQGHGI ICLPEG 
T/RWGSWKDGLILPIAGNRQPDLPRDERFLEDKDLDF 
NVSLAKGLKDLAIKGTLDF INCVKRLEDFKKI F PHGK 
TVLAERVYDS WKNDAFFGYQFLNGANPMLLRCS SRLP 
ACLVLPPGMEDLKTQLEKELQAGSLFEVDFSLLDGVK 
PNVI I FKQQCVAAPLWLKLQ PDGGLLPMVIQLQPP * 

HLLRGHL I AE VI AVATMRS L PSLHPI YKLL I PHFR YT 
MAINTLAQS SLVSE WGI FDLWSTGSGSHVD I LQRAM 
ACLTYHSLCPPDDLADRGLLDVKSSFYG*DAIRLWGI 
I SRE* \ YVEGMVGLFYNSDQAMKDDLELQAWCREMTE 
TGLQRAQDQGFLI SLESRAQLCHF VTMC I FTCTGQHA 
SNHLGQLDWYSWI PNGPCTMQKPPPISKDVTEKDIVD 
LLPNLHQARMQKTFTKFLGRRQPVMHEEKYFSGPEPQ 
AVLRQFQEELASMDKEIEVRNAVLNLPCEYL* PSMVE 
\NSVTI 


2184 


A 


2227 


332 


MGKYTVRVATGDLLLAGS PNLVQLWLVGEHGEADLGK 
QLPPVWGKE AE FE I DVPLHLGRLLMVKLRKHNVLLS L 
DWFCKWI SVQGPGTQGAAFFPCYRWVQGHGI ICLPEG 
T/RWGSWKDGLILPIAGNRQPDLPRDERFLEDKDLDF 
NVSLAKGLKDLAI KGTLDF INCVKRLEDFKKI F PHGK 
TVLAERVYDSWKNDAFFGYQFLNGANPMLLRCSSRLP 
ACLVLPPGMEDLKTQLEKELQAGSLFEVDFSLLDGVK 
PNVI I F KQQCVAAPLWLKLQ PDGGLL PMVI QLQPP * 
UdC PPPT .T ,FT .P QMP PM AWT J • AKTWVRS SDFOLOOLOS 

n 1 n 1 1 1 t n Jj front C L V ITXVV XJJJTXXVJL »» V nUUWi. w-'-'Se: Vi - LJ V: ' 

HLLRGHLIAEVI AVATMRSLPSLHPI YKLLI PHFRYT 
MAINTLAQS SLVSEWGI FDLWSTGSGSHVD I LQRAM 
ACLTYHSLC PPDDLADRGLLDVKS S F YG * DAI RLWGI \ 
I SRE * \ YVEGMVGLFYNSDQAMKDDLELQAWCREMTE 
TGLQRAQDQGFLI SLESRAQLCHFVTMC I FTCTGQHA 
SNHLGQLDWYSWI PNGPCTMQKPP PI SKDVTEKD I VD 
LLPNLHQARMQKTFTKFLGRRQPVMHEEKYFSGPEPQ 
AVLRQFQEELASMDKEIEVRNAVLNLPCEYL* PSMVE 
\NSVTI 


2185 


B 


1 


1110 


iuir*T t t r*T r*AT naopuozi PCapr , T?\/WRT7PRftlP'RPfiTiPT 
VLAGQREFWVGVGSAALHSERPAGPTTPGSKGLSTQV 
SSCGGRTGS PSSAS PLALRS I SRWGLS HLPHGAGLRT 
CS PAMPKP PHS AVGS CATRAS LI STAPRSRAPGP I DH 
PRAETCQRTVQELAGSSTCSPVQDPLGEASWAPEFEG 
SGPKRRANGRGAYGLRDTGVHS SGVAARSPAAAERWV 
QGF PKQNVHF VNDNT I C YPCGNYVI FI NI ETKKKTVL 
QCSNG I VGVMATNI PCE WAF S DRKLKPL I YVYS F PG 
LTRRTKLKADQERDPFLYLFQVAEFLTQGCLQISAFS 
PTSQRYQALLGQMWDLIRGHRFSVEKSVETSSSCSA 


2186 


A 


22 


960 


ARPGPDMAALYACTKCHQRFPFEALSQGQQLCKECRI 
AHPWKCTYCRTE YQQE S KTNT I CKKCAQNVQLYGTP 
KPCQYCNIIAAFIGNKCQRCTNSEKKYGPPYSCEQCK 
RKHLSSSSRAGHQEKEQYSRLSGGGHYNSQKTLSTSS 

J. l^IN Ci X. JrlVLVrvOXVC uOX X X IN OL/ O J» O irXJJ-fcriJJi-'CJ rvjiuin v 

I IAQLKEEVATLKKMLHQKDQMILEKEKKITELKADF 
OYOESOMPJVKMMOMEKTHKEVTEOLOAKNREIjLKOAA 
ALSKSKKSEKSGAITSP 


2187 


A 


612 


812 


RSGRTWTG I GYS KALQS SNRNTKS LLQNE FMMVYS F 
RALSFKESTWATFQHGGEATKSRSLSSTQ 


2188 


A 


612 


812 


RSGRTWTGIGYS KALQS SNRNTKS LLQNE FMMVYS F 
RALSFKESTWATFQHGGEATKSRSLSSTQ 


2189 


A 


612 


812 


RSGRTVVTGIGYSKALQSSNRNTKSLLQNEFMMVYSF 
RALSFKESTWATFQHGGEATKSRSLSSTQ 


2190 


A 


612 


812 


RSGRTWTGIGYSKALQSSNRNTKSLLQNEFMMVYSF
P A T . Q T7V"R Q T W A T AT KS RS LS S TO 


2191 


A 


612 


812 


RSGRTVVTGIGYSKALQSSNRISTTKSLLQNEFMMVYSF 
RALSFKESTWATFQHGGEATKSRSLSSTQ


2192 


A 


936 


745 


RRNSPGLCFLLPSLFHLRLLWRLLLWHQVFFDVAIFV 
IGGICSVSGFVHSLEGLIEAYRTNAED 


2193 


A 


122 


643 


MPSGCRCLHLVCLLCILGAPGQPVRADDCSSHCDLAH 
GCCAPDGSCRCDPGWEGLHCERCVRMPGCQHGTCHQP 
WQCICHSGWAGKFCDKDEHICTTQSPCQNGGQCMYDG
GGEYHCVCLPGFHGRDCERKAGPCEQAGSPCRNGGQC 
QDDOGFALNFTCRCLVGFVGARCDV* 


2194 


A 


1 


1406 


NWSRAPPAPVEDLSKVSYEELLQWSKEELIRSLRRA 
EAEKVSAMLDHSNLIREVNRRLQLHLGEIRGLKDINQ
KLQEDNQELRDLCCFLDDDRQKGKRVSREWQRLGRYT 
a r'TTMUifcnrn T .VT OVT »TmT .FVTCiTEEVVKENMEL / KELC 
VLLDEEKGAG\SQAAAAPSTARPACANSQP/PTAPYV
RDVGDGSSTSSTGSTDSPDHHKHHASSGSPEHLQKPR
SEGSPEHSKHRSASPEHPQKPRACGTPDRPKALKGPS
PEHHKPLCKGSPEQQRHPHPGSSPETLPKHVLSGSPE
HFQKHRSGSSPEHARHSGGSPEHLQKHALGGSLEHLP
RARGTSPEHLKQHYGGSPDHKHGGGSGGSGGSGGGSR
EGTLRRQAQEDGSPHHRNVYSGMNESTLSYVRQLEAR
VRQLEEENRMLPQASQNTGRPPTKNSSHMEKGWGSRA
RRVLHWWQGCRGIGRCLATLTGSFRWSS


2195 


A 


1461 


197 


GVTHLFLFGKRKLRNGIAEDLKGQADFFFLLVSEAW
ATGSPRAWLTCLILPLPGIIFSVLPKAMSRPLLITFT 

IGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAV 
AEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELR 
RLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAA
VSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARM 
VRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRT 
STAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRGGWE 
PGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRV
THFPAWDTRAVAGRE\TPPRVATLSWREERRREE\KD
RAWERDLRTYMNLEF


2196 


A 


10 


768 


SFAGAAARPSTPPASGRGAAPGRPGPSPMDLRAGDSW 
GMLACLCTVLWHLPAVPALNRTGDPGPGPSIQKTYDL 
TRYLEHQLRSLAGTYLNYLGPPFNEPDFNPPRLGAET
LPRATVDLEVWRSLNDKLRLTQNYEAYSHLLCYLRGL
NRQAATAELRRSLAHFCTSLQGLLGSIAGVMAALGYP
LPQPLPGTEPTWTPGPAHSDFLQKMDDFWLLKELQTW
LWRSAKDFNRLKKKMQPPAAAVTLHLGAHGF


2197 


A 


1 


1054 


P P I ARLQE F Ga b KKHMAA P b(j V HJjIi V KKubrlKlr bbF 
LNHIYLHKQSSSQQRRNFFFRRQRDISHSIVLPAAVS
SAHPVPKHIKKPDYVTTGIVPDWGDSIEVKNEDQIQG
LHQACQLARHVLLLAGKSLKVDMTTEEIDALVHREII
SHNAYPSPLGYGGFPKSVCTSVNNVLCHGIPDSRPLQ
DGDIINIDVTVYYNGYHGDTSETFLVGNVDECGKKLV
EVARRCRDEAIAACRAGAPFSVIGNTISHITHQNGFQ
VCPHFVGHGIGSYFHGHPEIWHHANDSDLPMEEGMAF
TIEPIITEGSPEFKVLEDAWTWSLD/TSKVSAQFEH
TVLITSRGAQILTKLPHEA


2198 


A 


2319 


957 


SPGTPAAGRTSRTVQTPF*SRTPLALMIGSENWPGLQ
/FPAKWAP*ANHLTFAGLTPNHSGTK\WAGISGTRLS
LPGAGAAAPEVPRRCRRHCPECLQPAGNAAPEQSGGC 
RJjAFJj * ARS To b RAKCalj IAj b & VKK FCj V Alab yKAKJjljl 
P*LPFLLGVSSPSPKSGSRTAAMHQPRLSSPIQRRRK
CSGEREASHYEPALSKAVRSVGGSPKSASGDAGRARS
\SRAPNSESSNMAARLAIEREEKAGD*QAARRRRGPP
PPFTSGI*SRLPEAGTMSA*QPTLEFGG/SLP*SKGN
SSHSKELEASPSWGRQPGAV\SGNCGMCPWGPEKTE
GRCSRPVTTAWCSLCSSCCCPMTSLSIPSQNCSKRLL
SSSLCSSSSRILQSSSTSSSFSSCSSTPSSSRLAWST
SYSISSKGPSS*QLCTLPSASPFMSGS*TYAGKTPTA
SYGQMDFKCCLYSRD


2199 


A 


1 


3349 


MDQPEAPCSSTGPRLAVARELLLAALEELSQEQLKRF
RHKLRDVGPDGRSIPWGRLERADAVDLAEQLAQFYGP
EPALEVARKTLKRADARDVAAQLQERRLQRLGLGSGT
LLSVSEYKKKYREHVLQLHARVKERNARSVKITKRFT
KLLIAPESAAPEEALGPAEEPEPGRARRSDTHTFNRL
FRRDEEGRRPLTWLQGPAGIGKTMAAKKILYDWAAG
KLYQGQVDFAFFMPCGELLERPGTRSLADLILDQCPD
RGAPVPQMLAQPQRLLFILDGADELPALGGPEAAPCT
DPFEAASGARVLGGLLSKALLPTALLLVTTRAAAPGR
LQGRLCSPQCAEVRGFSDKDKKKYFYKFFRDERRAER
AYRFVKENETLFALCFVPFVCWIVCTVLRQQLELGRD
LSRTSKTTTSVYLLFITSVLSSAPVADGPRLQGDLRN

KKELPGVLETEVTYQFIDQSFQEFLAALSYLLEDGGV
PRTAAGGVGTLLRGDAQPHSHLVLTTRFLFGLLSAER
MRDIERHFGCMVSERVKQEALRWVQGQGQGCPGVAPE
VTEGAKGLEDTEEPEEEEEGEEPNYPLELLYCLYETQ
EDAFVRQALCRFPELALQRVRFCRMDVAVLSYCVRCC
PAGQALRLISCRLVAAQEKKKKSLGKRLQAR\LGGGS
WLGTQLAPEVPFRPPCCDICPTPPPDPRLLQGKAFAR
VPLNIAPIQPLPRGLASVERMNVTVLAGAGPGDPKTH
AMTDPLCHLSSLTLSHCKLPDAVCRDLSEALRAAPAL
TELGLLHNRLSEAGLRMLSEGLAWPQCRVQTVRVQLP
DPQRGLQYLVGMLRQSPALTTLDLSGCQLPAPMVTYL
CAVLQHQGCGLQTLSLSLPSDPTPSSFSGRCREPGRR
IAjJjJCj S RW PKo fa P b CjHQRGE DPGGGGkGKGRR EbAK 
EGTPGPRAPPTAAPGRSSGSRLELCSLRALRAGNARP 
PDATHAAAASGDRGEPGPRPRVHVPPPGPAQRPPPPP 
RDRPRLPATARALGAGTADLPGGAAAGRLLLPPGPGV

c*y KJJ 1 Lib liAvjtfUCKt'tj^AAAAyAyyjjH^ 

CPLSAQ 


2200 


A 


877 


446 


GIRCRFGTSEIRAHATAKATVAAFTASEGHAHPRWE
LPKTDEGLGFNIMGGKEQNSPIYISRVIPGGVADRHG
GLKRGDQLLSVNGVSVEGEQHEKAVELLKAAQGSVKL 
VWYTPRVLEEMEARFEKMRSARRRQQHQSYS 


2201 


A 


48 


474 


SCLARPFRAQVSSSGFRAQNFPGVGSWAVAVGAGMAQ
LEGYCFSAALSCTFLVSCLLFSAFSRALREP\YMDEI
FHLPQAQRYCEGHFSLSQWDPMITTLPGLYLVSVGVV
KPAIWIFGWSEHWCSIGMLRFVNLLFSVGNF


2202 


A 


3140 


1502 


FRRLHSVPRGSALCAMDGIVPDIAVGTKRGSDELFST 
CVTNGPFIMSSNSASAANGNDSKKFKGDSRSAGVPSR 
VIHIRKLPIDVTEGEVISLGLPFGKVTNLLMLKGKNQ 
AFIEMNTEEAANTMVNYYTSVTPVLRGQPIYIQFSNH
KELKTDSSPNQARAQAALQAVNSVQSGNLALAASAAA
VDAGMAMAGQS PVLRI I VENLFY PV i LDVLHQI b biU? 
GTVLKIITFTKNNQFQALLQYADPVSAQHAKLSLDGQ
NIYNACCTLRIDFSKLTSLNVKYNNDKSRDYTRPDLP
SGDSQPSLDQTMAAAFGLSVPNVHGALAPLAIPSAAA
AAAAAGRIAIPGLAGAGNSVLLVSNLNPERVTPQSLF
ILFGVYGDVQRVKILFNKKENALVQMADGNQAQLAMS
HLNGHKLHGKPIRITLSKHQNVQLPREGQEDQGLTKD
YGNSPLHRFKKPGSKNFQNIFPPSATLHLSNIPPSVS
EEDLKVLFSSNGGWKGFKFFQKDRKMALIQMGSVEE
AVQALIDLHNHDLGENHHLRVSFSKSTI


2203 


A 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGF
GNAASARHHGLLASARQPGVCHYGTKLACCYGWRRNS
KGVCEATCEPGCKFGECVGPNKCRCFPGYTGKTCSQD
VNECGMKPRPCQHRCVNTHGSYKCFCLSGHMLMPDAT
CVNSRTCAMINCQYSCEDTEEGPQCLCPSSGLRLAPN
GRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCHIG
FELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGS
FKCKCKQGYKGNGLRCSAIPENSVKEVLRAPGTIKDR

NYEEIVSRGGNSHGG\KKGNEEKMKEGLEDEKREEKA
LKD*HRRERPFRG\DVFFPKVNEAGEFGLIL\VQRKA
LTSKLEHKADLNISVDCSFNHG\ICDW\KQDR\EDDF
DW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLK
LLLPDLQPQSNFCLLFDYRLAGDKVGKLRVFVKMSNN
ALAWEKTTSEDEKWKTGKIQLYQGTDATKSIIFEAER
GKGKTGEIAVDGVLLVSGLCPDSLLSVDD


2204 


A 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGF 
GNAASARHHGLLASARQPGVCHYGTKLACCYGWRRNS
KGVCEATCEPGCKFGECVGPNKCRCFPGYTGKTCSQD
VNECGMKPRPCQHRCVNTHGSYKCFCLSGHMLMPDAT
CVNSRTCAMINCQYSCEDTEEGPQCLCPSSGLRLAPN 
GRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCHIG 
FELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGS
FKCKCKQGYKGNGLRCSAIPENSVKEVLRAPGTIKDR
IKKLLAHKNSMKKKAKIKNVTPEPTRTPTPKVNLQPF
NYEEIVSRGGNSHGG\KKGNEEKMKEGLEDEKREEKA 
LKD*HRRERPFRG\DVFFPKVNEAGEFGLIL\VQRKA 
LTSKLEHKADLNISVDCSFNHG\ICDW\KQDR\EDDF
DW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLK
LLLPDLQPQSNFCLLFDYRLAGDKVGKLRVFVKNSNN
ALAWEKTTSEDEKWKTGKIQLYQGTDATKSIIFEAER
GKGKTGEIAVDGVLLVSGLCPDSLLSVDD


2205 


A 


2814 


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTP 
YTPNSQYQMLLDPTNPSAGTAKIDKQEKVKLNFDMTA 
SPKILMSKPVLSGGTGRRISLSDMPRSPMSTNSSVHT 
GSDVEQDAEKKATSSHFSASEESMDFLDKSTASPAST
KTGQAGSLSGSPKPFSPQLSAPITTKTDKTSTTGSIL
NLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQIR
SRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSE
DSEKSDSSDSEYISDDEQKS*GTSQEDTEDKEGCQMD
KEPSAVKKKPKPTNPVEIKEELKSTSPASEKADPGAV
KDKASPEPEKDFSGKAKPSPHPIKDKLKGKDETDSPT
VHLGLDSDSE\NELVIDLGEDHSGREGRKNKKEPKEP
SPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQTSAA
GATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\
TAPAVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQS
SPLVTSSGSMSTLVSSVNGDLPIGTASADVAADIAKY
TSKL\MDAIKGTM\TEIYNDLSKN\TTWKAQLAEDSQ 
GLRIEIEKLQWLHQQEL\SEMKKNLELTMAEMRQSWE 
QERDRLIAEVKKQLELEKQQAVDETKKKQWCANFKKE 
AIFYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAP 
Q\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\ 
SKEKETSAEKSKESGSTLDLSGSRETPSSILLGSNQG 
SDHSR\SNKSSWSSSDEKRGS\TRSDHN/TPSTQHGR 
SLLPGKESRAGTPFLGTSK 


2206 


A 


2814 


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTP
YTPNSQYQMLLDPTNPSAGTAKIDKQEKVKLNFDMTA 
SPKILMSKPVLSGGTGRRISLSDMPRSPMSTNSSVHT 
uSU VliyUAi2jivl\Al oorlr oAociCior'iiJr jjuivo xt\*Dtrt\o x 
KTGQAGSLSGSPKPFSPQLSAPITTKTDKTSTTGSIL
NLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQIR
SRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSE
DSEKSDSSDSEYISDDEQKS*GTSQEDTEDKEGCQMD
KEPSAVKKKPKPTNPVEIKEELKSTSPASEKADPGAV
KDKASPEPEKDFSGKAKPSPHPIKDKLKGKDETDSPT
VHLGLDSDSE\NELVIDLGEDHSGREGRKNKKEPKEP
SPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQTSAA
GATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\
TAPAVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQS
SPLVTSSGSMSTLVSSVNGDLPIGTASADVAADIAKY 
TSKL\MDAIKGTM\TEIYNDLSKN\TTWKAQLAEDSQ 
GLRIEIEKLQWLHQQEL\SEMKHNLELTMAEMRQSWE 
QERDRLIAEVKKQLELEKQQAVDETKKKQWCANFKKE 
AIFYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAP 
Q\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\ 
SKEKETSAEKSKESGSTLDLSGSRETPSSILLGSNQG 
SDHSR\SNKSSWSSSDEKRGS\TRSDHN/TPSTQHGR 
SLLPGKESRAGTPFLGTSK 


2207 


A 


2814 


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTP
YTPNSQYQMLLDPTNPSAGTAKIDKQEKVKLNFDMTA
SPKILMSKPVLSGGTGRRISLSDMPRSPMSTNSSVHT
GSDVEQDAEKKATSSHFSASEESMDFLDKSTASPAST
KTGQAGSLSGSPKPFSPQLSAPITTKTDKTSTTGSIL
NLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQIR
SRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSE
DSEKSDSSDSEYISDDEQKS*GTSQEDTEDKEGCQMD
KEPSAVKKKPKPTNPVEIKEELKSTSPASEKADPGAV
KDKASPEPEKDFSGKAKPSPHPIKDKLKGKDETDSPT
VHLGLDSDSE\NELVIDLGEDHSGREGRKNKKEPKEP
SPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQTSAA
GATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\
TAPAVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQS
SPLVTSSGSMSTLVSSVNGDLPIGTASADVAADIAKY
TSKL\MDAIKGTM\TEIYNDLSKN\TTWKAQLAEDSQ
GLRIEIEKLQWLHQQEL\SEMKHNLELTMAEMRQSWE
QERDRLIAEVKKQLELEKQQAVDETKKKQWCANFKKE
AIFYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAP
Q\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\
SKEKETSAEKSKESGSTLDLSGSRETPSSILLGSNQG
SDHSR\SNKSSWSSSDEKRGS\TRSDHN/TPSTQHGR
SLLPGKESRAGTPFLGTSK


2208 


A 


2814 


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTP
YTPNSQYQMLLDPTNPSAGTAKIDKQEKVKLNFDMTA
SPKILMSKPVLSGGTGRRISLSDMPRSPMSTNSSVHT
GSDVEQDAEKKATSSHFSASEESMDFLDKSTASPAST
KTGQAGSLSGSPKPFSPQLSAPITTKTDKTSTTGSIL
NLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQIR
SRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSE
DSEKSDSSDSEYISDDEQKS*GTSQEDTEDKEGCQMD
KEPSAVKKKPKPTNPVEIKEELKSTSPASEKADPGAV
KDKASPEPEKDFSGKAKPSPHPIKDKLKGKDETDSPT
VHLGLDSDSE\NELVIDLGEDHSGREGRKNKKEPKEP
SPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQTSAA
GATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\
TAPAVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQS
SPLVTSSGSMSTLVSSVNGDLPIGTASADVAADIAKY
TSKL\MDAIKGTM\TEIYNDLSKN\TTWKAQLAEDSQ
GLRIEIEKLQWLHQQEL\SEMKHNLELTMAEMRQSWE
QERDRLIAEVKKQLELEKQQAVDETKKKQWCANFKKE 
AIFYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAP 
A\ OT?an2XT?\ VMTRTT.MfCCJ cjnn^ 0 ; ^STOS APSETASA\ 
SKEKETSAEKSKESGSTLDLSGSRETPSSILLGSNQG 
SDHSR\SNKSSWSSSDEKRGS\TRSDHN/TPSTQHGR 
SLLPGKESRAGTPFLGTSK 


2209 


A


1 


575 


GGIPHYLRGVNNARQPWHNADVRLRYGLRPGNATEEG
LASLHSVLFRKQPFLWRAALLYYTIHRAARMSFRQLF
QDLERYVQDADVRWEYCVRAKRGQTDTSLPGCFSKDQ
VYLDGIVRILRHRQTIDFPLLTSLGKVSYEDVDHLRP
HGVLDNTRVPHFMQDLARYRQQLEHIMATNRLDEAEL
GRLLPD


2210 


A 


3 


1795 


LGLGSGTLLSVSEYKKKYREHVLQLHARVKERNARSV
KITKRFTKLLIAPESAAPEEALGPAEEPEPGRARRSD
THTFNRLFRRDEEGRRPLTVVLQGPAGIGKTMAAKKI
LYDWAAGKLYQGQVDFAFFMPCGELLERPGTRSLADL
ILDQCPDRGAPVPQMLAQPQRLLFILDGADELPALGG
PEAAPCTDPFEAASGARVLGGLLSKALLPTALLLVTT
RAAAPGRLQGRLCSPQCAEVRGFSDKDKKKYFYKFFR
ni?DDaT?pavp T?VTTR , MT?TT . F a T ,C F VP P VP W T VPT VTiRO 
QLELGRDLSRTSKTTTSVYLLFITSVLSSAPVADGPR
LQGDLRNLCRLAREGVLGRRAQFAEKELEQLELRGSK
VQTLFLSKKELPGVLETEVTYQFIDQSFQEFLAALSY
LLEDGGVPRTAAGGVGTLLRGDAQPHSHLVLTTRFLF
GLLSAERMRDIERHFGCMVSERVKQEALRWVQGQGQG
CPGVAPEVTEGAKGLEDTEEPEEEEEGEEPNYPLELL
YCLYETQEDAFVRQALCRFPELALQRVRFCRMDVAVL
SYCVRCCPAGQALRLISCRLVAAQEKKKKSLGKRLQA
SLGGG


2211 


A 


2 


1177 


GFVEAGEECYCVS\GQECRDLCCFAHNCSLRPGAQCA 
unnr^/PPT.T.KPArSAT.rRnAMQDrDTiPEFCTGTSSHC 
PPDVYLLDGSPCARGSGYCWDGACPTLEQQCQQLWGP
GSHPAPEACFQWNSAGDAHGNCGQDSEGHFLPCAGR
DALCGKLQCQGGKPSLLAPHMVPVDSTVHLDGQEVTC
RGALALPSAQLDLLGLGLVEPGTQCGPRMVCQSRRCR
KNAFQELQRCLTACHSHGVCNSNHNCHCAPGWAPPFC
nTfprjpnn^Mn^GPVOAEMHDTFLLAMLLSVLIjPLLPG 
AGLAWCCYRLPGAHLQRCSWGCRRDPACSGPKDGPHR 
DHPLGGVHPMELGPTATGQPWPLDPENSHEPSSHPEK 
PLPAVSPDPQADQVQMPRSCLW 


2212


A 


1073 


480 


XXPDALSTVAEXPGRPTRPPTRTAAPWPRPGCSSASA 
PPTPASAPWPASPSSSSGRWSTDSRGPRPWEGSQGCW
HCGSW*RT*CTCKIIGGPGSRGCAASSSWASSSRPSP
SLPSAPSSCWPSPGIRASQTPPATTSPASGASFPSSG
PSCSASMPTATGLTLLTSASSAISDPGGSVYA*SGMV
HQSGKEPSTVYTS 


2213 


A 


1 


2454 


MALQNALYTGDLARLQELFPPHSTADLLLESRAAEPR 
WSSHQRACPIAYTLAQEHSHVEPRIAPAGCVARLVEK 
PSRGSEEHLKSGPGPIVTRTASGPALAFWQAVLAGDV 
GCVSRILADSSTGLAPDSVFDTSDPERWRDFRFNIRA 
LRLWSLTYEEELTTPLHVAASRGHTEVLRLLLRRRAR
PDSAPGGRTALHEACAAGHTACVHVLLVAGADPNIAD
QDGKRPLHLCRGPGTLECAELLLRFGARVDGRSEEEE
ETPLHVAARLGHVELADLLLRRGACPDARNAEGWTPL
LAACDVRCQSITDAEATTARCLQLCSLLLSAGADADA
ADQDKQRPLHLACRRGHAAWELLLSCGVSANTMDYG
GHTPLHCALQGPAAALAQSPEHWRALLNHGAVRVWP
GALPKVLERWSTCPRTIEVLMNTYSWQLPEEAVGLV
TPETLQKHQRFYSSLFALVRQPRSLQHLSRCALRSHL
T^n^T.pnaT.PRTjPliPPRLLRYIiOLDFEGVSPGICEOSO 
LLGVQGCVEGKRRVGEGPSQNRPVPEPPEASESKPLL
PDVHGLLRGPESRCPSLQRARLCTNSGQVALAAGGPA
PQAGVDAAIPNAEKRTDSGSRPFQGLLRSGTAHGGKD
CPPGPHQVRLAGSRSAAHRRKRQLCAAATRGHPRPGP
TLPTMRGLSLANEWIGASFAGRLTNTFCAGLGQAVPS
MVALTTALPSFAEPPDAFYGPQELAAAAAAAAATAAR
NNPEPGGRRPEGGLEADELLPAREKVAEPPPPPPPHF
SETFPSLPGVDKLQGWDFRGHQDGGMLKQLSIQQWRA
RSGF


2214 


A 


757 


208 


NVFIEPRIQGFMKTSAHPGQKHPDFSMGLLFPLLAAL 
EVCSCGSSGSLGYNLPQNH\GLLGRNTLVLLGQMRRI 
SPFLCLKDRSDFRFPQEKVEVSQLQKA\QAMSFLYDV 
LQQVFNFSHKALL\CCMEHDLPGPTPHFTSSAAGTPG 
DLLGAGDGRRRSWGQWVIEGSTLALRRYFQESISTLE


2215 


A 


43 


1004 


QLWGFAAGSDSRPAMGCDGGTIPKRHELVKGPKKVEK
VDKDAELVAQWNYCTLSQEILRRPIVACELGRLYNKD
AVIEFLLDKSAEKALGKAASHIKSIKNVTELKLSDNP
AWEGDKGNTKGDKHDDLQRARFICPWGLEMNGRHRF
CFLRCCGCVFSERALKEIKAEVCHTCGAAFQEDDVIV
LNGTKEDVDVLKTRMEERRLRAKLEKKTKKPKAAESV
SKPDVSEEAPGPSKVKTGKPEEASLDSREKKTNLAPK
STAMNESSSGKAGKPPCGATKRSIADSEESEAYKSLF
TTHSSAKRSKEESAHWVTHTSYCF


2216 


A 


1323 


840 


FCPLGKPVMGPIFLDCRPFFLFPKPNQGTGTPLHNKV
PYFFQ*GPFGPLWNHRTLFFFLRWSFALLAQAGVQWR
DLGSLQPLPPGFK*FSCLSLPSIWDYRRLPPCPANFA
FLVETGFLHVGQVGL*LLTSGDPSASASQSSGITG\V 
SHHTWP*LSFLLWI 


2217 


A 


17 


348 


ARAAARAGFSSYLKSLPDVRKKSLPLPEKPHKEENSE 
IVWREFDKQVFLLN*SPRRQSKLYTVDLESGLHYLL
RVELAAHKSLAGAELKTLKDFVTVLAKLFPGRPPVK


2218 


A 


1 


1206 


MALSSWPVVLRLNMADFVFSFLCLGIGTSIVLGILFY 
LLQAHRYLQEGMTYQLALSFYLTWASVFLFLMTGMGE
DEESALQTLLDPRSSYLLVSLEILPTNPSPLSPCAVS 
EDESEMRGLSLLRRQSQATGRLEPTFKHDSTLLALQG 
ALGLYDGHTPPYAACLGFEFRKHLGNPAKDGGNVTVS 
LFYRNDSAHLPLPLSLPGCPAPCPLGRFYQLTAPARP 
PAHGVSCHGPYEAVIPPGPGAIIPSTGPAVGMQRERS
EVGSGVPARTVYASEQHAYMWHSALIPDSGLRGKPTL
SSRKPPQTSCGPEFANVLSLALCGALVVCKARAMDQA
RPRQLIGIDALRDPRASSRTRAGGLGMIRRQEEEPAA 
RTVLARCDSSPSECPSHARAPYDTGPLFNAKG 


2219 


A 


1 


1594 


NGGGSLNNYSSPIPSTPAPSRRDPQFSVPPTANTPTP
VCKRSMRWSNLFTSEKGSDPDKERKAPENHADTIGSG
RAIPIKQGMLLKRSGKWLKTWKKKYVTLCSNGMLTYY
SSLGDYMKNIHKKEIDLQTSTIKVPGKWPSLATSACT
PISSSKSNGLSKDMDTGLGDSICFSPSISSTTSPKLN
PPPSPHANKKKHLKKKSTNNFMIVSATGQTWHFEATT
YEERDAWVQAIQSQILASLQSCESSKSKSQLTSQSEA
MALQSIQNMRGNAHCVDCETQNPKWASLNLGVLMCIE
CSGIHRSLGPHLSRVRSLELDDWPVELRKVMSSIVND
LANSIWEGSSQGQTKPSEKSTREEKERWIRSKYEEKL
FLAPLPCTELSLGQQLLRATADEDLQTAILLLAHGSC
EEVNETCGEGDGCTALHLACRKGNVVLAQLLIWYGVD
VMARDAHGNTALTYARQASSQECINVLLQYGCPDECV
*YLFYLTAVSLVQKQNGKNKDNSEFQKEITNSANNSI
FSTFRKLSKYTKC


2220 


A 


1 


1594 


NGGGSLNNYSSPIPSTPAPSRRDPQFSVPPTANTPTP 
VCKRSMRWSNLFTSEKGSDPDKERKAPENHADTIGSG 
RAIPIKQGMLLKRSGKWLKTWKKKYVTLCSNGMLTYY
SSLGDYMKNIHKKEIDLQTSTIKVPGKWPSLATSACT
PISSSKSNGLSKDMDTGLGDSICFSPSISSTTSPKLN
PPPSPHANKKKHLKKKSTNNFMIVSATGQTWHFEATT
YEERDAWVQAIQSQILASLQSCESSKSKSQLTSQSEA
MALQSIQNMRGNAHCVDCETQNPKWASLNLGVLMCIE
CSGIHRSLGPHLSRVRSLELDDWPVELRKVMSSIVND
LANSIWEGSSQGQTKPSEKSTREEKERWIRSKYEEKL
FLAPLPCTELSLGQQLLRATADEDLQTAILLLAHGSC
EEVNETCGEGDGCTALHLACRKGNWLAQLLIWYGVD
VMARDAHGNTALTYARQASSQECINVLLQYGCPDECV
*YLFYLTAVSLVQKQNGKNKDNSEFQKEITNSANNSI
FSTFRKLSKYTKC 


2221 


A 


1 


1594 


NGGGSLNNYSSPIPSTPAPSRRDPQFSVPPTANTPTP 
VCKRSMRWSNLFTSEKGSDPDKERKAPENHADTIGSG 
RAIPIKQGMLLKRSGKWLKTWKKKYVTLCSNGMLTYY
SSLGDYMKNIHKKEIDLQTSTIKVPGKWPSLATSACT
PISSSKSNGLSKDMDTGLGDSICFSPSISSTTSPKLN
PPPSPHANKKKHLKKKSTNNFMIVSATGQTWHFEATT
YEERDAWVQAIQSQILASLQSCESSKSKSQLTSQSEA
MALQSIQNMRGNAHCVDCETQNPKWASLNLGVLMCIE
CSGIHRSLGPHLSRVRSLELDDWPVELRKVMSSIVND
LANSIWEGSSQGQTKPSEKSTREEKERWIRSKYEEKL
FLAPLPCTELSLGQQLLRATADEDLQTAILLLAHGSC
EEVNETCGEGDGCTALHLACRKGNWLAQLLIWYGVD
VMARDAHGNTALTYARQASSQECINVLLQYGCPDECV
*YLFYLTAVSLVQKQNGKNKDNSEFQKEITNSANNSI
FSTFRKLSKYTKC 


2222 


A 


1 


1594 


NGGGSLNNYSSPIPSTPAPSRRDPQFSVPPTANTPTP 
VCKRSMRWSNLFTSEKGSDPDKERKAPENHADTIGSG 
RAIPIKQGMLLKRSGKWLKTWKKKYVTLCSNGMLTYY
SSLGDYMKNIHKKEIDLQTSTIKVPGKWPSLATSACT 
PISSSKSNGLSKDMDTGLGDSICFSPSISSTTSPKLN 
PPPSPHANKKKHLKKKSTNNFMIVSATGQTWHFEATT
YEERDAWVQAIQSQILASLQSCESSKSKSQLTSQSEA 
MALQSIQNMRGNAHCVDCETQNPKWASLNLGVLMCIE 
CSGIHRSLGPHLSRVRSLELDDWPVELRKVMSSIVND 
JjANblWhLiboUCay 1 K.PbfctKb lKhfcil^RWIKbKYkhKJ_i 
FLAPLPCTELSLGQQLLRATADEDLQTAILLLAHGSC
EEVNETCGEGDGCTALHLACRKGNWLAQLLIWYGVD 
T/Macnaur'M'paT r rvAi?riziCQn"E , r , TKrcrr t rivpPDPcn/' 

vJYl/\KiJx\n.olN X niil Xi\K^r\oo<^.Eil~XjN v ±jL»\^X\3\^irLjCt\* V 

*YLFYLTAVSLVQKQNGKNKDNSEFQKEITNSANNSI
FSTFRKLSKYTKC 


2223 


A 


2 


415 


GGFAAAVESFHHEDVLLFAALMAHELGHNLGIQHDHS

KGACLFNKPRPRGRKRRDSACGNGWEDTDQCDCGSL 
CQHHACCDENCILKAKA*CNDGPCCHK


2224 


A 


53 


325 


MRLSVCLLLLTLALCCYRANAWCQALGSEITGFLLA 
GKPVFKFQLAKFKAPLEAVAAKMEVKKCVDTMAYEKR 
VLITKTLGKIAEKCDR* 


2225 


A 


9 


422 


ESRERSGNRRGAEDRGTCGLQSPSAMLGAKPHWLPGP
lino FCjXjFljVljVijJaAJjVjALjW^ 

EPGRAAAGGPGGAALGEAPPGRVAFAAVRSHHHEPAG
ETGNGTSGAIYFDQVLVNEGGGFDRAS 


2226 


A 


42 


722 


MGCDGRVSGLLRRNLQPTLTYWSVFFSFGLCIAFLGP 
TLLDLRCQTHSSLPQISWVFFSQQLCLLLGSALGGVF 
KRTLAQSLWALFTSSLAISLVFAVIPFCRDVKVLASV
MALAGLAMGCIDTVANMQLVRMYQKDSAVFLQVLHFF

SMSPGCWGQHHVDAQALVQPDVPKADSQGPGREPEGP 
MPSG* 


2227 


A 


42 


722 


MGCDGRVSGLLRRNLQPTLTYWSVFFSFGLCIAFLGP
TLLDLRCQTHSSLPQISWVFFSQQLCLLLGSALGGVF 

VTJTT anCT MBT.CTCOT 2V T CT T/CAWT DUPD'nVTrUT.fc. Q\T 
JSivlXiAybXjWivur XooijAlbljVrAVXr'rUKUVJXVijAo V 

MALAGLAMGCIDTVANMQLVRMYQKDSAVFLQVLHFF
VGFGALLSPLIADPFLSEANCLPANSTGQHHLPRATC
SMSPGCWGQHHVDAQALVQPDVPKADSQGPGREPEGP 
MPSG* 


2228 


A 


2 


474 


TGPTIKNMDGTFNVTSCLKLNSSQEDPGTVYQCWRH 

ASLHTPLRSNFTLTAARHSLSETEKTDNFSIHWWPIS 

FIGVGLVLLIVLIPWKKICNKSSSAYTPLKCILKHWN

<?T7nTOTTiK'K'RHT,TT7PPTP AWPGYOT lODflRAWPPROSV 
DrJJiy i ijivivcirixj jl rrv* i.i\_rvvv xr o iyuyiL/oaAM it it v 

NINTYSTTV 


2229 


A 


2 


1654 


GRGDSSSSGSGSGSGSGSRACPARPSAPGLRAPTPPP 
RLPGASGAPAARLTLKFLAVLLAAGMLAFLGAVICII
ASVPLAASPARALPGGADNASVASGAAASPGPQRSLS
ALHGAGGSAGPPALPGAPAASAHPLPPGPLFSRFLCT
PLAAACPSGAQQGDAAGAAPGEREELLLLQSTAEQLR
QTALQQEARIRADQDTIRELTGKLGRCESGLPRGLQG
AGPRRDTMADGPWDSPALILELEDAVRALRDRIDRLE
ELPARVNLSAAPAPVSAVPTGLHSKMDQLEGQLLAQV 
LALEKERVALSHSSRRQRQEVEKELDVLQGRVAELEH
GSSAYSPPDAFKISIPIRNNYMYARVRKALPELYAFT
ACMWLRSRSSGTGQGTPFSYSVPG/QAGNEIVLLEAG
HEPMELLINDKVAQLPLSLKDNGWHHICIAWTTRDGL
WSAYQDGELQGSGENLAAWHPIKPHGILILGQEQDTL
GGRFDATQAFVGDIAQFNLWDHALTPAQVLGIANCTA
pt .t ,nwvT .PWPnK' t a yfifia tk 1 a a RnvPTCOR a k a 

if xjxjvj IN VUrrH EiUIVxj V £irvx VjVji-V J. J\rlrir J-» V V^INAJlvruVrl. 


2230 


A 


3 


913 


FMTDVNSWLLTFGFQLHNVIPGYPKPDMDAMEPSYEL
IHTQMKTQEWDNSKSILGVQCEVQKQLKAFVTLERFD
QLYGSTITSCQQAPKTKKFASSGSVFGKGVKFALKDG
RVTTDT T c; V AKTRnfiRR VA ATT ,TJT4 AHYT iRNTjHPTT DGV 
DTHYFVKPGPSEGDLAILGLSGGRRTLENGVNVTVSQ
INTVLNGRTRRYTDIOLOYGALCLNTRYGTTLDEEKA

X AH X V J II 11 VJXV X XVXV X X l-> X» ^ J— X VjniJ * mi X l\ X V-J -L. J. i II./ UUiUi 

RVLELSRQRAVRQAWAREQQRLREGEEGLRAWTEGEK
QQVLSTGRVQGYDGFFVISVEQYPELSDSANNIHFMR 
QSEMGRR 


2231 


A 


488 


75 


ASVPKTNKIEPRSYSIIPSCGIQAARACFEHSNFFKV 
MA citt p AHH CI A K <Z T PR A PR OICC1R RR AVART iA ADR PP A P 
KIQLRAF*LQQL*YTLLELELPRLLAPDLPSNGSSLK
DLKWTHSNYRASKESCIVIFRHYLPGS


2232 


A 


3 


181 


HERDVLFNLCENLVKSSEANSPAHEEFKTMLLIAHYY
ATRSAAESVYQL*AVSRVLLSLVY






A. 


ft z? 6 


fTT KATCWT .TMVDT .P^ T FT .fiT^TTjTiVWVfiVI R YIiGYFOA 
YNVLILTMQASLPKVLRFCACAGMIYLGYTFCGWIVL
GPYHDKFENLNTVAECLFSLVNGDDMFATFAQIQQKS
ILWLFSRLYLYSFISLFIYMILSLFIALITDSYDTI 
KKFQQNGFPETDLQEF 


2234 


A 


1 


492 


KIKAKNLxWDLCSIFLGTSTLLVWVGVIRYLGYFQA 
YMVTjT LTMOASL PKVLRFCAC AGMI YLGYTFCGWI VL 
GPYHDKFENLNTVAECLFSLVNGDDMFATFAQIQQKS
ILVWLFSRLYLYSFISLFIYMILSLFIALITDSYDTI 
KKFQQNGFPETDLQEF 


<4^J -3 




1 

x, 


D / D 


prnFT?WT-TSS /OKATPAEEVEDSNDSSYSEPPDVOOOL 
NHYQSAALARNNSRVSPVPLSGAAAGTEQKTEAVLHC
EFCEFSSGYIQSIRRHYRDKHGGKKLFKCKDCSFYTG 
FKSAFTMHVEAGHSAVPEEGPKDLRCPLCLYHTKYKR 
MMIDHIVLHREERWPIEVCRSKLSKYLOGVVFRCDK 
CTFTCSR 






D U 




MPT.TjRYARNMLRTWSSLPWTRFRVCLLSLSLFLWANR 
LEDSRSCQPNPMSLTTLPGHRLKEAVWLPAPSRTMSP 
HLDPNQLGILLRVLRKEKEDGDYPDMMATHPSSRYEA
CSSGITLAAPPTHGPRPTDPRIGPAP 


2237 


c 


60 


472 


MPLLEYARNMLRTWSSLPWTRFRVCLLSLSLFLWANR 
LEDSRSCQPNPMSLTTLPGHRLKEAVWLPAPSRTMSP 
HLDPNQLGILLRVLRKEKEDGDYPDMMATHPSSRYEA
CSSGITLAAPPTHGPRPTDPRIGPAP 


2238 


A 


129 


329 


VSNIVDPHQTVGLSTQEPGDIFTYSEFDGILGLAYPS 
LASE*SVPVLDNTMQRHLVAQDLFSVYMSR


2239 


A 


130 


502 


DSRIPKEAPDQQIOCKMGPPSLVLCLLSATVFSLLGGS
SAFLSHHRLKGRFQRDRRNIRPNIILVLTDDQDVELG
SMQVMNKTRRIMEQGGAHFINAFVTTPMCCPSRSSIL 
TGKYVHNHNTYMY 


2240 


A 


3 


498 


WDTnTTnuCT^WVPTTJDT VVT VT CnDOPMDVIfTTMM 
iivbi VV lyrlr i-i* V X Xr» inirX X X XiJxXov* ovjiN Jr ISJxX X rim 

DCGIHAREWIAPAPCQWFVKEILQNHKDNSRIRKLLM 
NLDFYVLPVLNIDGYIYTWTTDRLWRKSRSPHNNGTC
FGTDLNRNFNASWCSIGASRNCQDQTFCGTGPVSEPE
TKAVASFIESKNDDFCA


2241 


A 


3 


498 


vvD\nrpnut?T *TrrvT7TUDTWT.ifT cnDcnxTDVifT TWM 
xJ&JS VV iynr Li K V ± ±ri IttkrJL X X JjJVXov -trOVjIM t?uu\j. X m »i 

DCGIHAREWIAPAFCQWFVKEILQNHKDNSRIRKLLM
NLDFYVLPVLNIDGYIYTWTTDRLWRKSRSPHNNGTC
FGTDLNRNFNASWCSIGASRNCQDQTFCGTGPVSEPE
TKAVASFIESKNDDFCA


2242 


A 


972 


468 


MAAAGAGRLRRVASALLLRSPRLPARBLSAPARLYHK

JN.V V L/n X CilN IT IvLM V ^jOXjUIN- J- O XvLN V OX VJXJ V Orl.E\rlv*V7J~' V I'llv 

LQIQVDEKGKIVDARFKTFGCGSAIASSSLATEWVKG
KTVEEALTIKNTDIAKELCLPPVKLHCSMLAEDAIKA
ALADYKLKQEPKKGEAEKK 


2243 


A 


1193 


548 


TQAWTRAEKDRKGSVRALRLHLERGPPT*RGSHPL\Q 
SVPCIQKPSIFSSYPI/GLPQSGGEPGPVGEQQPVRR 
PEQPSCGPASRMPLTSRSVPPGRGALPPDSLSTRKGL 

dd p c t a r* up \ ro T? Q rJUTTW PX7 Q n R T .NT . P VMd S NT iO P 

PRKVAVPGPTR*RDQDSKQDFSSKPLQSVPGLASTQQ 
TLTPADSGPGTGGRDATRAGLPGVETMGNGVD


2244 


A 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRST 
PAMMNGQGSTTSSSKNIAYNCCWDQCQACFNSSPDLA 
DHIRSIHVDGQRGGVFVCLWKGCKVYNTPSTSQSWLQ 
RHMLTHSGDKPFKCWGGCNASFASQGGLARHVPTHF 
SQQNSSKVSSQPKAKEESPSKAGMNKRRKLKNKRRRS
LARPHDFFDAQTLDAIRHRAICFNLSAHIESLGKGHS
WFHSTVSILLFFQIKYKTLQKNISTIISKSLKI


2245 


A 


3834 


2068 


GARGRPLAETWPFLTAPVLPGQLQITEPTMAEKGDCI
ASVYGYDLGGRFVDFQPLGFGVNGLVLSAVDSRACRK
VAVKKIALSDARSMKHALREIKIIRRLDHDNIVKVYE
VLGPKGTDLQGELFKFSVAYIVQEYMETDLARLLEQG
TLAEEHAKLFMYQLLRGLKYIHSANVLHRDLKPANIF
ISTEDLVLKIGDFGLARIVDQHYS\HKGYLSEGLVTK
WYRSPRLLLSPNNYTKAIDMWAAGCILAEMLTGRMLF

AvjAri Hi Li o vJiu^J L)X Li Hi 1 lr V xixuCiUiNX/OJJUiv vntrsjc voa 

TWEVKRPLRKLLPEVNSEAIDFLEKILTFNPMDRLTA
EMGLQHPYMSPYSCPEDEPTSQHPFRIEDEIDDIVLM
AANQSQLSNWDTCSSRYPVSLSSDLEWRPDRCQDASE
VQRDPRAGSAPLAENVQVDPRKDSHSSSERFLEQSHS
SMERAFEADYGRSCDYKVGSPSYLDKLLWRDNKPHHY
SEPKLILDLSHWKQAAGAPPTATG\LADTGAREDEPA
SLFLE\IAQWVKSTQG\AQSTPARPPTTPSAACLPRP
P\PPGPGGCR\RQPPVRPGRVHLPRPEALHQARGPAG

Q 


2246 


A 


328 


595 


VIEWWPVEPPNQLSTSSVGRVPGSTRPQRSFLSRW 
RAALPLQLLLLLLLLLACLLPSSEEDYSCTQANNFAR 
SFYPMLRYTNGPPPT 


2247 


A 


548 


811 


SSFIKRHILIFEDDWHQTTCCHHPHHP\F*RCQFHIF 
YVSVQNSISPSLSVSSSHPDRPDHEVHQHRAAHHHQH
GQGPLGHGLVARVG 


2248 


A 


37 


441 


GXAGVGGDSEGEVTSALSATFSGPKIAFYVGLKSPHE 
GYEVLKFDDWTNLGNHYDPTTGKFSCQVRGIYFFTY 
HILMRGGDGTSMWADLCKNGQVRASAIAQDADQNYDY
ASNSWLHLDSGDEVYVKLDGGKA 


2249 


A 


808 


112 


RRYKSGTEVNNTDGGIARLIVFGTGQKDWTATDPKEP 
ADLVAIAFGGVCVGFSNAKFGHPNNIIGVGGAKSMAD
GWETARRLDRPPILENDENGILLVPGCEWAVFRLAHP
GVITRIEIDTKYFEGNAPDSCKVDGCILTTQEEAVIR
QKWILPAHKWKPLLPVTKLSPNQSHLFDSLTLELQDV
ITHARLTIVPDGGVNRLRLRGFPSSICLLRPREKPML
KFSVSFKANP 


2250 


A 


189 


1811 


PPFGGLSAAQTIGEMWEAQFLGLLFLQPLWVAPVKPL 
QPGAEVPWWAQEGAPAQLPCSPTIPLQDLSLLRRAG 
VTWQHQPDSGPPAAAPGHPLAPGPHPAAPSSWGPRPR 
RYTVLSVGPGGLRSGRLPLQPRVQLDERGRQRGDFSL 
WLRPARRADAGEYRAAVHLRDRALSCRLRLRLGQASM
TASPPGSLRASDWVILNCSFSRPDRPASVHWFRNRGQ
GRVPVRESPHHHLAESFLFLPQVSPMDSGPWGCILTY
RDGFNVSIMYNLTVLGLEPPTPLTVYAGAGSRVGLPC
RLPAGVGTRSFLTAKWTPPGGGPDLLVTGDNGDFTLR
LEDVSQAQAGTYTCHIHLQEQQLNATVTLAIITVTPK
SFGSPGSLGKLLCEVTPVSGQERFVWSSLDTPSQRSF
SGPWLEAQEAQLLSQPWQCQLYQGERLLGAAVYFTEL
SSPGAQRSGRAPGALPAGHLLLFLTLGVLSLLLLVTG
TFGFHLWRRQCRP\RRFSALEQGIH\P\RQAQSKIEE 
LEQEPEPEPEPEPEPEPEPEPEQL 


2251 


A 


3 


3773 


SWPRGRGETGGHPGALRTRTMQKSVRYNEGHALYLAF 
LARKEGTKRGFLSKKTAEASRWHEKWFALYQNVLFYF 
EGEQSCRPAGMYLLEGCSCERTPAPPRAGAGQGGVRD 
ALDKQYYFTVLFGHEGQKPLELRCEEEQDGKEWMEAI 
HQASYADILIEREVLMQKYIHLVQIVETEKIAANQLR 
HQLEDQDTEIERLKSEIIALNKTKERMRPYQSNQEDE 
DPDIKKIKKVQSFMRGWLCRRKWKTIVQDYICSPHAE
SMRKRNQIVFTMVEAESEYVHQLYILVNGFLRPLRMA
ASSKKPPISHDDVSSIFLNSETIMFLHEIFHQGLKAR 
IANWPTLILADLFDILLPMLNIYQEFVRNHQYSLQVL 
ANCKQNRDFDKLLKQYEANPACEGRMLETFLTYPMFQ 

VMHDEVSDTENIRKNLAIERMIVEGCDILLDTSQTFI
RQGSLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLF
TKHFLICTRSSGGKLHLLKTGGVLSLIDCTLIEEPDA
SDDDSKGSGQVFGHLDFKIWEPPDRAAFTWLLAPS
RQEKAAWMSDISQCVDNIRCNGLMTIVFEENSKVTVP
HMIKSDARLHKDDTDICFSKTLNSCKVPQIRYASVER
LLERLTDLRFLSIDFLNTFLHTYRIFTTAAWLGKLS
DIYKRPFTSIPVRSLELFFATSQNNRGEHLVDGKSPR
LCRKFSSPPPLAVSRTSSPVRARKLSLTSPLNSKIGA
LDLTTSSSPTTTTQSPAASPPPHTGQIPLDLSRGLSS
PEQSPGTVEENVDNPRVDLCNKLKRSIQKAVLESAPA
DRAGVESSPAADTTELSPCRSPSTPRHLRYRQPGGQT

7\ TMtTTV tTf OTTO T"»7\ £?5\DJ\T7\ ""PA * jy ntrri O TJTJOCVTKT'T't? DTPn 

ADNAHCo Vo PAbAr AI ATAAAGHGo PPUr rJN I KK J. L.JJ 
KEFIIRRTATNRVLNVLRHWVSKHAQDFELNNELKMN
VLNLLEEVLRDPDLLPQERKAAANILMALSQDDQDDI
HLKLEDIIQMTDCMKAECFESLSAMELAEQITLLDHV
IFRSIPYEEFLGQGWMKLDKNERTPYIMKTSQHFNDM
SNLVASQIMNYADVSSRANAIEKWVAVADICRCLHNY
NGVLEITSALNRSAIYRLKKTWAKVSKQTKALMDKLQ
KTVSSEGRFKNLRETLKNCNPPAVPYLGMYLTDLAFI
EEGTPNFTEEGLVNFSKMRMISHIIREIRQFQQTSYR
IDHQPKVAQYLLDKDLIIDEDTLYELSLKIEPRLPA


2252 


A


1


4602


ASGNLDKNARFSAIYRQDSNKLSNDDMLKLLADFRKP
EKMAKLPVILGNLDITIDNVSSDFPNYVNSSYIPTKQ
FETCSKTPITFEVEEFVPCIPKHTQPYTIYTNHLYVY
PKYLKYDSQKSFAKARNIAICIEFKDSDEEDSQPLKC
IYGRPGGPVFTRSAFAAVLHHHQNPEFYDEIKIELPT
QLHEKHHLLLTFFHVSCDNSSKGSTKKRDWETQVGY
SWLPLLKDGRWTSEQHIPVSANLPSGYLGYQELGMG
RHYGPEIKWVDGGKPLLKISTHLVSTVYTQDQHLHNF
FQYCQKTESGAQALGNELVKYLKSLHAMEGHVMIAFL
PTILNQLFRVLTRATQEEVAVNVTRVIIHWAQCHEE
GLESHLRSYVKYAYKAEPYVASEYKTVHEELTKSMTT
ILKPSADFLTSNKLLKYSWFFFDVLIKSMAQHLIENS
KVKLLRNQRFPASYHHAVETWNMLMPHITQKFRDNP
EASKNANHSLAVFIKRCFTFMDRGFVFKQINNYISCF
APGDPKTLFEYKFEFLRWCNHEHYIPLNLPMPFGKG
RIQRYQDLQLDYSLTDEFCRNHFLVGLLLREVGTALQ
EFREVRLIAISVLKNLLIKHSFDDRYASRSHQARIAT
LYLPLFGLLIENVQRINVRDVSPFPVNAGMTVKDESL
ALPAVNPLVTPQKGSTLDNSLHKDLLGAISGIASPYT
TSTPNINSVRNADSRGSLISTDSGNSLPERNSEKSNS
LDKHQQSSTLGNSWRCDKLDQSEIKSLLMCFLYILK
SMSDDALFTYWNKASTSELMDFFTISEVCLHQFQYMG
KRYIARTGMMHARLQQLGSLDNSLTFNHSYGHSDADV
LHQSLLEANIATEVCLTALDTLSLFTLAFKNQLLADH
GHNPLMKKVFDVYLCFLQKHQSETALKNVFTALRSLI
YKFPSTFYEGRADMGAALCYEILKCCNSKLSSIRTEA
SQLLYFLMRNNFDYTGKKSFVRTHLQVIISVSQLIAD

WfJTnnTR pnn ^T.^TTNNPATJ^nRl^TKHTSFSSDVKD 
V VvjIvAj i. t\r ^OUij JL J.lNlNV_^\iMOi-/*\.jJX xvn x jr v iuj 

LTKRI RTVLMATAQMKEHEND PEMLVDLQYS LAKS YA 
STPELRKTWLDSMARIHVKNGDLSEAAMCYVHVTALV 
AEYLTRKEAVQWEPPLLPHSHSACLRRSRGGVFRQGC 
TAFRVITPNIDEEASMMEDVGMQDVHFNEDVLMELLE 
QCADGLWKAERYELI ADI YKLI I PI YEKRRDFERLAH 
LYDTLHRAYSKVTEVMHSGRRLLGTYFRVAFFGQAAQ 
YQFTDSETDVEGFFEDEDGKEYIYKEPKLTPLSEISQ 
RLLKLYSDKFGSENVKMIQDSGKVNPKDLDSKYAYIQ 
VTHVI PFFDEKELQERKTEFERSHNIRRFMFEMPFTQ 
TGKRQGGVEEQCKRRTILTAIHCFPYVKKRIPVMYQH 
HTDLNPIEVAIDEMSKKVAELRQLCSSAEVDMIKLQL 
KLQGSVSVQVNAGPLAYARAFLDDTNTKRYPDNKVKL 
LKEVFRQFVEACGQALAVNERLIKEDQLEYQEEMKAN 
YREMAKELSEIMHEQLG 


2253 


A 


1 


782 


MRMEAGEAAPPAGAGGRAAGGWGKWVRLNVGGTVFLT 
TRQTLCREQKSFLSRLCQGEELQSDRDETGAYLIDRD 
PTYFGPILNFLRHGKLVLDKDMAEEGVLEEAEFYNIG 
PLIRIIKDRMEEKDYTVTQVPPKHVYRVLQCQEEELT 
QMVSTMSDGWRFEQLVNIGSSYNYGSEDQAEFLCWS 
KELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEVEE 
VEVEQVQVEADAQEK/CCYKPEAPGCEAPDHLQGLGV 
PI 


2254 


A 


2407 


2216 


SGCVEMLYSHSLEYNPEWISVQSAVAPAQLALNSDGD 
L*LHSGERTRRD*QLPEAGGPGLQEPLQLGELDITSD 
EFILDEVDG\VDLRHYSKQVELELQQIEQKSIRDYIQ 
ESENIASLHNQITACDAVLERMEQMLGAFQSDLSSIS 
SEIRTLQEQSGAMNIRLRNRQAVRGKLGELVDGLWP 
SALVTAILEAPVTEPRFLEQLQELDAKAAAVREQEAR 
GTAACADVRGVLDRLRVKAVTKIREFILQKIYSFRKP 
MTNYQIPQTALLKYRFFYQFLLGNERATAKEIRDEYV 
ETLSKIYLSYYRSYLGRLMKVQYEEVAEKDDLMGVED 
TAKKGFFSKPSLRSRNTIFTLGTRGSVISPTELEAPI 
LVPHTAQRGEQRYPFEALFRSQHYALLDNSCREYLFI 
CEFFWSGPAAHDLFHAVMGRTLSMTLKHLDSYLADC 
YDAIAVFLCIHIVLRFRNIAAKRDVPALDRYWEQVLA 
LLWPRFELILEMNVQSVRSTDPQRLGGLDTRPHYITR 
RYAEFSSALVSINQTIPNERTMQLLGQLQVEVENFVL 
RVAAEFSSRKEQLVFLINNYDMMLGVLM\E*ERAADD 
SKEVESFQQLLNARTQEFIEELLSPPFGGLVAFVKEA 
EALIERGQAERLRGEEARVTQLIRGFGSSWKSSVESL 
SQDVMRSFTNFRNGTSIIQGALTQLIQ\LYHRFHRV\ 
LSQPQLRALPARAELINIHHLMVELKKHKPNF 


2255 


A 


1205 


462 


ASITVSSGRIPTSLSVGPPGAPLHRPQKPREGAWDME 
DVAPTGVRQAFSELPFPSHVLPEPGFPDTDPSQVYSP 
GLPPAPAQPSSIPPCALVSQPTVQFILQGSLPLVGCG 
AAQTLAPVPAALTPASEPASQATAASNSEEKTPAPRL 
AAEKTKKEEYMKKLHMQERAVEEVKLAIKPFYQKREV 
TKEEYKDILRKAVQKICHSKSGEINPVKVANLVKAYV 
DKYRHMRRHKKPEAGEEPPTQGAEG 


2256 


A 


1205 


462 


ASITVSSGRIPTSLSVGPPGAPLHRPQKPREGAWDME 
DVAPTGVRQAFSELPFPSHVLPEPGFPDTDPSQVYSP 
GLPPAPAQPSSIPPCALVSQPTVQFILQGSLPLVGCG 
AAQTLAPVPAALTPASEPASQATAASNSEEKTPAPRL 
AAEKTKKEEYMKKLHMQERAVEEVKLAIKPFYQKREV 
TKEEYKDILRKAVQKICHSKSGEINPVKVANLVKAYV 
DKYRHMRRHKKPEAGEEPPTQGAEG 


2257 


A 


901 


521 


FFFGNGVSPCRQAGV*WHDLDSLQNLPPGFKRFSYLS 
LPSSW\DYRHVLPRQANFCIF/M*RRGFTMLARMVSI 
S*PRDLPALASQSAGITGVSHHAPPQMDFTFALLCFA 
LKGCLPRQKEGGTLNLI 


2258 


A 


186 


1338 


TRMSRHEGVSCDACLKGNFRGRRYKCLICYDYDLCAS 
CYESGATTTRHTTDHPMQCILTRVDFDLYYGGEAFSV 
EQPQSFTCPYCGKMGYTETSLQEHVTSEHAETSTEVI 
CPICAALPGGDPNHVTDDFAAHLTLEHRAPRDLDESS 
GVRHVRRMFHPGRGLGGPRARRSNMHFTSSSTGGLSS 
SQSSYSPSNREAMDPIAELLSQLSGVRRSAGGQLNSS 
GPSASQLQQLQMQLQLERQHAQAARQQLETARNATRR 
TNTSSVTTTITQSTATTNIANTESSQQTLQNSQFLLT 
RLNDPKMSETERQSMESERADRSLFVQELLLSTLVRE 
ESSSSDEDDRGEMADFGAMGCVDIMPLDVALENLNLK 
ESNKGNEPPPPPL 


2259 


A 


1157 


481 


SWPGQAEPSEREFWREAAETRGSEVFEIMNPVYSPG 
S SGVP YANAKGI GYPAGFPMG YAAAAPAYS PNMYPGA 

INlrlryivyx 1 xrVaX rllvVoLorl ovxrt V JrxrX oooxrlN tr 1 \£Ji 

AVYPVRS AYPQQS P YAQQGT YYTQPLYAAPPHVI HHT 
TWQPNGMPATVYPAP I PPPRGNGVTMGMVAGTTMAM 
SAGTLLTAHS PTPVAPHPVTVPTYRA\QGTPTYS YVP 
PQW 


2260 


A 


33 


563 


MVLS VPVI ALGATLGTATS IlaALCGVTCLCRHMHPKK 
GLLPRDQDPDLEKAKPSLLGSAQRFNVKKSTEPVQPR 
ALLKF PDI YGPRP AVTAPE VI N YAD YSLRSTEE PTAP 
AS PQPPNDSRLKRQVTEELFI LPQNGWEDVCVMETW 
NPQKAGSWNQAPKLHYCLDYDCHKAECL* 


2261 


A 


6120 


2968 


HPSPGFDRVRAAMDPNTI IEALRGTMDPALREAAERQ 
LNEAHKSLNFVSTLLQITMSEQLDLPVRQAGVIYLKN 
MITQYWPDRETAPGDISPYTIPEEDRHCIRENIVEAI 
IHSPELIRVQLTTCIHHIIKHDYPSRWTAIVDKIGFY 
LQSDNS ACTniiGI LLCL YQLVKNYE YKKPEERS PLVAA 
MQHFLPVLKDRFIQLLSDQSDQSVLIQKQI FKI FYAL 
VQYTLPLELINQQNLTEWIE ILKTVVNRDVPNETLQV 
EEDDRPELPWWKCKKWALHILARLFERYGSPGNVSKE 
YNE FAEVFLKAFAVGVQQVLLKVL YQYKE KQYMAPRV 
LQQTLNYINQGVSHALTWKNLKPHI QGIIQDVI FPLM 
CYTDADEELWQEDPYEYIRMKFDVFEDFISPTTAAQT 
LLFTACSKRKEVLQKTMGFCYQI LTEPNADPRKKDGA 
LHMIGSLAEILLKKKI\YKDQMEYMLPESMYSPLiF\S 
SELG\YMRARACWVLHYFCEVKFKSDQNLQTALELTR 
RCIiIDDREMPVKVEAAIALQVLISNQEKAKEYITPFI 
RPVMQALLHI IRETENDDLTNVIQKMICEYSEEVTPI 
AVEMTOHIiAMTFNOVIOTGPDEEGSDDKAVTAMGI LN 
TIDTLLSVVEDHKEITQQLEGICIX3VIGTVLQQHVLE 
FYEEIFSLAHSLTCQQVSPQMWQLLPLVFEVFQQDGF 
DYFTDMMPLLHNYVTVDTDTLLSDTKYLEMIYSMCKK 
VLTGVAGEDAECHAAKLLEVI ILQCKGRGIDQCI PLF 
VEAALERLTREVKTSEL*TMGLQVAIAALHYNAYLLIj 
NTLENLHF PNNVE PVTNHFI / QWLNDVDCFLGLHDRR 
MCVLSLCALIDMEQIPQGLNQVSGQILPAFILLFNGL 
KRAYACHAEHENDSDDDDEAEDDDETEELGSDEDDID 
EDGQE YLE I LAKQAGEDGDDEDWEEDDAEETALEGYS 



TI IDDEDNPVDEYQI FKAI FQTIQNRNPVWYQALTHG 
LNEEQRKQLQDIATLADQRRAAHESKMIEKHGGYKFS j 
AP\ WPS S FNFGGPAPGMN 


2262 


A 


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVW 
S AWYGKC VKGKGSL PL S AHGI WAWLSRAEWDQVTVY 
LFCDDHKLQRYALNRI TVWRS RSGNEL PLAVASTADL 
I RCKLLDVTGGLGTDELRLLYGMALVRF VNLI SERKT 
KFAKVPLKCLAQEVNI PDWI VDLRHELTHKKMPHIND 
CRRGCYFVLDWLQKTYWCRQLENSLRETWELEEFREG 
IEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKA 
DGDSKGSEEVDSHCKKALSHKELYERARELLVSYEEE 
QFTVLEKFRYLPKAI KAWNNPSPRVECVLAELKGVTC 
ENREAVLDAFLDDGFLVPTFEQLAALQIEYEENVDI^ 
DVLVPKPFSQFWQPLLRGLHSQNFTQALLERMLSELP 
ALGISGIRPTYILRWTVELIVANTKTGRNARRFSAGQ 
WEARRGWRLFNCSASLDWPRMVESCLGS PCWAS PQLL 
RI I F\KAMGQGLQDE\EQEKLLRICS I YTQSGENSLV 
QEGSEAS PI GKS P YTLDSLYWS VKPASS SFGSEAKAQ 
QQEEQGSVNDVKEEEKEEKEVIiPDQVEEEEENDDQEE 
EEEDEDDEDDEEEDRMEVGPFSTGQESPTAENARLLA 
QKRGALQGS A WQVS S EDVRWDTF P \ LGRMPRSRPRTP 
AELMLENYDTHVI FWTKPVL\ EQRLEPSTCK\TDTLG 
L\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\ 
KTGLQLF 


2263 


A 


1 


528 


LGNTVLHYC SMYS KPECLKLLLRS KPTVDI VNQAGET 
ALDIAKRLKATQCEDLLSQAKSGKFNPHVHVEYEWNL 
RQEE IDESDDDLDDKPS PVKKERS PRPQS FCHS S S I S 
PQDKLALPGFSTPRDKQRLSYGAFTNQI FVSTSTDS P 
TSPTTEAPPLPPRNAGKGPTGPPITPHR 


2264 


A 


422 


2 


APGAS VGRAQAAEG* RGGPTGRPPS ALGVS / EAGRAG 
RAGEGRPVPPAYPLCKSAQTSGPPKARLS\ PPLASCG 
GRGPPGGAACATCAPPAGPARSSRCRRRSPPE *GPR* 
PSRPARPS PGS AASRRQKLTPCRCQFRGLCA 


2265 


A 


1 


1742 


VSAVEFVLHGKDFQVDCKASGSPVP*ISWSLLDGTMI 
NNAMQADDS GHRTRRYTLFNNGTLYFNKVGVAEEGDY 
TC YAQNTLGKDEMKVHLTVI TAAPRI RQSNKTNKRI K 
AGDTAVLDCEVTGD PKPKI FWLLPSNDMI SFSI DRYT 
FHANGSLTINKVKLLDSGEYVCVARNPSGDDTKMYKL 
DWSKPPLINGLYTNRTVI KATAVRHSKKHFDCRAEG 
TPS PE VMWI MPDNI FLTAP YYGSRI TVHKNGTLE I RN 
VRLSDSADF I CVARNEGGESVLWQLEVLEMLRRPTF 

TRFSNGPQS YQYLI ASNGS FI I S KTTREDAGKYRCAA 
RNKVGYIEKLVILEIGQKPVILTYAPGTVKGISGESL 
SLHCVSDGI PKPNI KWTMPSGYWDRPQINGKYI LHD 
NGTLVI KEATAYDRGNYI C KAQNS VGHTLI TVPVM I V 
AYPPRITNRPPRSIVTRTGAAFQLHCVALGVPKPEIT 
WEMPDHSLLSTASKERTHGSEQLHLQGTLVIQNPQTS 
DSGIYKCTAKNPLGSDYAATYIQVI 


2266 


A 


2334 


68 


RWHQAPGPVRQRPPDDLQPGPGL\WMPGPARMTTESA 
GQKIKELLSGIGNISERVSFLKK/RG*PAEQGTDKPQ 

RGHERE\RAL\RQAARAPDPSPAPPAPPGAACRHRMC 

SWPPAC*RTAAASAWTRGGPCASRCSPAPVLSTWTRP 

APPSPTPSPWTSSAPVLGGR*RYPWALVRTAGSTCP* 

PCPA\PVLQSQGGGGRGPCPIiRL*G/PPFWMSAPPTS 

CPSKR\GLPAPEQAHSGHAAVSALPWPGPATHTGPLP 

TRPHPRPWGHFSGNLSGAWQPASRTRLPAGRVPAPIC 

GFHQGVGGA/GSELP*RTATQACPCAVPPCSGSLLRM 

LLWTS*GPEHYLPSR\DGP*WRQRSPHRPRG/VP*PT 

CAQQGPSRPWRFKWKAP\SGRHLQGAPCRCRAHADDG 

DRAGRPGLQRS * S PCAVP PPDPRQ PRDTAAGGADPAR 

PALHGG* GQLLCHRPEAATGVPAAAPPQPHPAVTRRA 

C PWALATLPAS VTAPPGLMG * RETELAWPE PSGKVGP 

GHVGAERS * KCLEAVEHKADSDWEQPRRALNLAGRSF 

ASSAGVSPSLTAAAAPAL/ GLPHCWAAFPPPQQPLRP 

GGSAGHSGPGGP\GNRISGVWTWGEFVTVAATPPGAP 

AAPLGGTTRCPTVPLSHCSH\CPAAHSGTPR\WRVLP 

ETKAQNSMQGAPASARGLVPHQGRASGWPVAGMLNN* 

VPPAGAVPSTVHYFQGHSG\GAVAGGGP*APAPSLLP 

QPG\ HGPPPGAGVF I WGGCSRRS RCRHC PR 


2267 


A 


29 


175 


KSRPGTVAHACNPSTLGSRGGRIIPAQEFKTSLGNTV 
SE\PCLYLRKNN 


2268 


A 


29 


175 


KSRPGTVAHACNPSTLGSRGGRIIPAQEFKTSLGNTV 
SE\PCLYLRKNN 


2269 


A 


961 


365 


PRVRLNGCGRLAALGRGLKSFLRGTSLCEEIMSLALR 
SELWDKTKRKKRRELSEEQKQEIKDAFELFDTDKDE 
AI D YHELKVAMRALGFDVKKADVLKI LKDYDRE ATGK 
ITFEDFNEWTDWILERDPHEEILKAFKLFDDDDSGK 
ISLRNLRRVARELGENMSDEELRAMIEEFDKDGDGEI 
NQEEFIAIMTGDI 


2270 


A 


131 


1567 


NKLVTERQILGDPTYMRQADGRKVLRSSIREFLCSEA 
MFHLGVPTTRAGACVTSESTWRDVFYDGLDPLRFLS 
LQMSTQGVQAPAW/RRNDIRVQLLDYVISSFYPEIQA 
AHASDSVQRNAAFFREVTRRTARMVAEWQCVGFCHGV 
LNTDNMSILGLTIDYGPFGFLDRYDPDHVCNASDNTG 
RYAYSKQPEVCRWNLRKLAEALQPELPLELGEAILAE 
EFDAEFQRHYLQKMRRKLGLVQVELEEDGALVSKLLE 
TMHLTGADFTNTFYLLSSFPVELESPGLAEFLARLME 
QCASLEELRLAFRPQMDPRQLSMMLMLAQSNPQLFAL 
MGTRAGIARELERVEQQSRLEQLSAAELQSRNQGHWA 
DWLQAYRARLDKDLEGAGDAAAWQAEHVRVMHANNPK 
YVLRNYIAQNAIEAAERGDFSEVRRVLKLLETPYHCE 
AGAATDAEATEADGADGRQRSYSSKPPLWAAELCVT* 
SSFYPEIQAAHASDSVQRNAAFFREVTRRTARMVAEW 
QCVGFCHGVLNTDNMSILGLTIDYGPFGFLDRYDPDH 
VCNASDNTGRYAYSKQPEVCRWNLRKLAEALQPELPL 
ELGEAILAEEFDAEFQRHYLQKMRRKLGLVQVELEED 
GALVSKLLETMHLTGADFTNTFYLLSSFPVELESPGL 
AEFLARLMEQCASLEELRLAFRPQMDPRQLSMMLMLA 
QSNPQLFALMGTRAGIARELERVEQQSRLEQLSAAEL 
QSRNQGHWADWLQAYRARLDKDLEGAGDAAAWQAEHV 
RVMHANNPKYVLRNYIAQNAIEAAERGDFSEVRRVLK 
LLETPYHCEAGAATDAEATEADGADGRQRSYSSKPPL 
WAAELCVT 


2271 


A 


131 


1567 


NKLVTERQILGDPTYMRQADGRKVLRSSIREFLCSEA 
MFHLGVPTTRAGACVTSESTWRDVFYDGLDPLRFLS 
LQMSTQGVQAPAW/RRNDIRVQLLDYVISSFYPEIQA 
AHASDSVQRNAAFFREVTRRTARMVAEWQCVGFCHGV 
LNTDNMSILGLTIDYGPFGFLDRYDPDHVCNASDNTG 
RYAYSKQPEVCRWNLRKLAEALQPELPLELGEAILAE 
EFDAEFQRHYLQKMRRKLGLVQVELEEDGALVSKLLE 
TMHLTGADFTNTFYLLSSFPVELESPGLAEFLARLME 
QCASLEELRLAFRPQMDPRQLSMMLMLAQSNPQLFAL 
MGTRAGIARELERVEQQSRLEQLSAAELQSRNQGHWA 
DWLQAYRARLDKDLEGAGDAAAWQAEHVRVMHANNPK 
YVLRNYIAQNAIEAAERGDFSEVRRVLKLLETPYHCE 
AGAATDAEATEADGADGRQRSYSSKPPLWAAELCVT* 
SSFYPEIQAAHASDSVQRNAAFFREVTRRTARMVAEW 
QCVGFCHGVLNTDNMSILGLTIDYGPFGFLDRYDPDH 
VCNASDNTGRYAYSKQPEVCRWNLRKLAEALQPELPL 
ELGEAILAEEFDAEFQRHYLQKMRRKLGLVQVELEED 
GALVSKLLETMHLTGADFTNTFYLLSSFPVELESPGL 
AEFLARLMEQCASLEELRLAFRPQMDPRQLSMMLMLA 
QSNPQLFALMGTRAGIARELERVEQQSRLEQLSAAEL 
QSRNQGHWADWLQAYRARLDKDLEGAGDAAAWQAEHV 
RVMHANNPKYVLRNYIAQNAIEAAERGDFSEVRRVLK 
LLETPYHCEAGAATDAEATEADGADGRQRSYSSKPPL 
WAAELCVT 


2272 


A 


53 


439 


FFLPLLI 1 1 YCYI FI FRAMRETGRALQTFGACKGNGE 

SLWQRQRLQSECKMAKIMLLVILLFVLSWAPYSAVAL 

VA1? AfiYAHVTiTPYMS SVPAVI AKAS AI HNPI I YAITH 
VAr rivj x xin vui r x i t 0 v rn v imuioni xll 1 * .l x x x rax x 1 1 

PKYRVAIAQHLPCLGVLL 


2273 


A 


9 


410 


MTTTFPPRKMVAQFLLVAGNVANITTVSLWEEFSSSD 
LADLRFLDMSQNQFQYLPDGFLRKMPSLSHLNLHQNC 
LMTLHIREHE PPGALTELDLS HNQLSELHIiAPGLASC 
LGS LRLFNLS SNQLLGVPPG PLY 


2274 


A 


73 


489 


FLLLRS AS PEHTCVKSKTLDPMVI FFTSGTTGF PKMA 
KHSHGLALQPSFPGSRKLRSLKTSDVSWCLSDSGWIV 
ATI WTLVEPWTAGCTVFIHHLPQFDTKVI IQTLVKYP 
INHFWGVSS I YRMI LQQDFTS I RFPALE 


2275 


A 


3 


1238 


LTKMHLTENPHPQVTHVSSSQSGCSIASDSGSSSLSD 
IYQATESEVGDVDLTRLPEGPVDSEDDEEEDEEIDRT 
DPLQGRDLVRECLEKEPADKTDDDIEQLLEFMHQLPA 
FANMTMSVRRELCSVMIFEWEQAGAIILEDGQELDS 
WYVILNGTVEISHPDGKVENLFMGNSFGITPTLDKQY 
MHGIVRTKVDDCQFVCIAQQDYWRILNHVEKNTHKVE 
EEGEIVMVHEHRELDRSGTRKGHIVIKATPERLIMHL 
IEEHSIVDPTYIEDFLLTYRTFLESPLDVGIKLLEWF 
KIDSLRDKVTRIVLLWVNNHFNDFEGDPAMTRFLEEF 
EKNLEDTKMNGHLRLLNIACAAKAKWRQVVLQKASRE 
SPLQFSLNGGSEKGFGIFVEGVEPGSKAADSGLKRGD 
QIMEV 


2276 


A 


3 


1238 


LTKMHLTENPHPQVTHVSSSQSGCSIASDSGSSSLSD 
IYQATESEVGDVDLTRLPEGPVDSEDDEEEDEEIDRT 
DPLQGRDLVRECLEKEPADKTDDDIEQLLEFMHQLPA 
FANMTMSVRRELCSVMIFEWEQAGAIILEDGQELDS 
WYVILNGTVEISHPDGKVENLFMGNSFGITPTLDKQY 
MHGIVRTKVDDCQFVCIAQQDYWRILNHVEKNTHKVE 
EEGEIVMVHEHRELDRSGTRKGHIVIKATPERLIMHL 
IEEHSIVDPTYIEDFLLTYRTFLESPLDVGIKLLEWF 
KIDSLRDKVTRIVLLWVNNHFNDFEGDPAMTRFLEEF 
EKNLEDTKMNGHLRLLNIACAAKAKWRQVVLQKASRE 
SPLQFSLNGGSEKGFGIFVEGVEPGSKAADSGLKRGD 
QIMEV 


2277 


A 


1 


794 


FRGFLDRGDCAALPCTYPHSPCSH*GGNCLPSLLTRP 
CVKA*PQMSGRKSSMRRWRRQSRLTAGTSS*TPTSST 
MC * ALVGS S TWNCMLQAGSTAPGAGT PGSR PTWS S S S 
TCSWTAPSGRARCACASSSSCAMSAARRGWTSPACWR 
RTSRAWWTTS S P ACAS S ATAS VAASTASTWPAARTTG 
GTAESSARPARRASCTGSPARSCWRRRRPPTPSPGRP 
APPSRRTRRAQAGTSALSPGACFGPRSCC*SSTCSSL 
SVAPY 


2278 


A 


269 


832 


MGSSRLAALLLPLLLIVIDLSDSAGIGFRHLPHWNTR 
CPLASHTDDSFTGSSAYIPCRTWWALFSTKPWCVRVW 
HCSRCLCQHLLSGGSGLQRGLFHLLVQKSKKSSTFKF 
YRRHKMPAPAQRKLLPRRHLSEKSHHISIPSPDISHK 
GLRSKRTPPFGSRDMGKAFPKWDSPTPGGDRPSSFEL 
LP* 


2279 


A 


269 


832 


MGSSRLAALLLPLLLIVIDLSDSAGIGFRHLPHWNTR 
CPLASHTDDSFTGSSAYIPCRTWWALFSTKPWCVRVW 
HCSRCLCQHLLSGGSGLQRGLFHLLVQKSKKSSTFKF 
YRRHKMPAPAQRKLLPRRHLSEKSHHISIPSPDISHK 
GLRSKRTPPFGSRDMGKAFPKWDSPTPGGDRPSSFEL 
LP* 


2280 


A 


2 


381 


VLPTAQGKLYQDDLKVNPANVSHLVSPFTWQGPGGHL 
KAPQWTTSSLFPFQI RNVGTGLCADTKHGALGS PLRL 
EGCVRG\ RGEAAWNNMQVRAAPQGLAARF SETS AAWG 
ADTASWEGEAWVSDK 


2281 


A 


1 


993 


MRDLFGTRLRRAEDVF PPVIGVAAHKGGVYKTS VSVH 
LAQDLALKGLRVLLVEGND PQGTASMYHGWVPDLHI H 
AEDTLLPF YLGE KDDVT YAI KPTC W PGLD 1 1 PS CLAL 
HRI ETELMGKFDEGKLPTDPHLMLRLAI ETVAHDYDV 
IVIDSAPNLGIGTINWCAADVLIVPTPAELFDYTSA 
LQFFDMLRDLLKNVDLKGFE PDDLKKS FKSPE PRLFT 
PEEFFRIFNRSIDAFKDFWASETSDCWSSTLSPEK 
VLRASWKRDSDNSLKSLSPTQIRLGEVLTPVMSAFWE 
AEVWNSGDSDDMALDFDCTSSEVDAESTNRKVLRP 


2282 


A 


3 


582 


SLYQFSWETAGPGTLVGRLRAQDPDLGDNALMAYSI 
LDGEGSEAFSISTDLQGRDGLLTVRKPLDFESQRSYS 
FRVEATNTLIDPAYLRRGPFKDVASVRVAVQDAPEPP 
AFTQAAYHLTVPENKAPGTLVGQISAADLDSPASPIR 
YSILPHSDPERCFSIQPEEGTIHTAAPLDREARAWHN 
LTVLATEL 


2283 


A 


3 


582 


SLYQFSWETAGPGTLVGRLRAQDPDLGDNALMAYSI 
LDGEGSEAFSISTDLQGRDGLLTVRKPLDFESQRSYS 
FRVEATNTLIDPAYLRRGPFKDVASVRVAVQDAPEPP 
AFTQAAYHLTVPENKAPGTLVGQISAADLDSPASPIR 
YSILPHSDPERCFSIQPEEGTIHTAAPLDREARAWHN 
LTVLATEL 


2284 


A 


1 


831 


KNVWKR WKKRF FVLVQ VI Q YT F AMCS YRE KKAE PQEL 
LQLDGYTVDYTDPQPGLEGGRAFFNAVKEGDTVIFAS 
DDEQDRI LWVQAMYRATGQS HKPVP PTQVQKLN AKGG 
NVPQLDAPISQFYADRAQKHGMDEFISSNPCNFDHAS 
LFEMVQRLTLDHRLNDS YS CLGWFS PGQVFVLDE YCA 
RNGVRGCHRHLCYLRDLLERAENGAMIDPTLXHYSFA 
FCASHVHGNRPDGI GNC * LLKKRNVF * RKS KEEXSXV 
LLRKIRLQHFRXLLFPFG 


2285 


A 


140 


445 


MQPSGLEGPGTFGRWPLLSLLLLLLLLQPVTCAYTTP 
GPPRALTTLGAPRAHTMPGTYAPSTTLSSPSTQGLQE 
QARALMRDFPLVDGHNDLPLVLRQVYHN 


2286 


A 


294 


1568 


MSLTIWTVCGVLSLFGALSYAELGTTIKKSGGHYTYI 
LEVFGPLPAFVRWVELLI IRPAATAVI SLAFGRYIL 
EPFFIQCEIPELAIKLITAVGITWMVLNSMSVSWSA 
RI Q I FLTFCKLTAI LI 1 1 VPGVMQL I KGQTQNF KDAF 
SGRDS S I TRLPLAFYYGMYAYAGWFYLNFVTEE VENP 
EKTI PLAICI SMAI VTIGYVLTNVAYFTTINAEELLL 
SNAVAVTFS ERLLGNF S LAVP I F VALSC FGSMNGGVF 
AVSRLFYVASREGHLPE I LSMI HVRKHTPLPAVI VLH 
PLTMIMLFSGDLDSLLNFLSFARWLFIGLAVAGLIYL 
RYKCPDMHRPFKVPLFI PALFSFTCLFMVALSLYSUP 
FSTGIGFVITLTGVPAYYLFIIWDKKPRWFRIMSEKI 
TRTLQI I LEWPEEDKL* 


2287 


A 


3397 


630 


S PGGRTPAARDSWREVI QNS KEVS I VYWQEKNCCAS 
SAVRCKLSRRGDGQA* C * EINQ\NLAEEAGLNITH\ I 
CLA\PDSSEAEIIDEILKINEDTRVHGLALQISENLF 
SNKVLNALKPEKDVDGVTDINLGKLVRGDAHECFVS P 
VAKAVIELLEKSGVNLDGKKILWGAHGSLEAALQCL 
FQRKGSMTMSIQWKTRQLQSKLHEADIWLGSPKPEE 
IPLTWIQPGTTVLNCSHDFLSGKVGCGSPRIHFGGLI 
EEDDVI LLAAALRI QNMVS SGRRWLREQQHRRWRLHC 
LKLQPLS PVPSDI E I SRGQTPKAVDVLAKEIGLLADE 
IEI YGKS KAKVRLS VLERLKDQADGKYVLVAGI TPTP 
LGEGKSTVTIGLVQALTAHLNVNSFACLRQPSQGPTF 
GVKGGAAGGGYAQVI PMEE FNLHLTGD I HAI TAANNL 
LAAAIDTRILHENTQTDKALYNRLVPLVNGVREFSEI 
QLARLKKLGINKTDPSTLTEEEVSKFARLDIDPSTIT 
WQRVLDTNDRFLRKI TI GQGNTEKGHYRQAQFD I AVA 
SEIMAVLALTDSLADMKARLGRMWASDKSGQPVTAD 
DLGVTGALTVLMKDAI KPNLMQTLEGT PVFVHAGPFA 
NI AHGNS S VLADICI ALKLVGE EGFWTEAGFGADIGM 
EKFFNI KCRASGLVPNVWLVATVRALKMHGGGPSVT 
AGVPLKKEYTEENIQLVADGCCNLQKQIQITQLFGVP 
WVALNVFKTDTRAEI DLVCE LAKRAGAFDAVPC YHW 
SVGGKGSVDLARAVREAAS KRSRFQFLYDVQVP IVDK 
IRTI AQAVYGAKDIELS PEAQAKI DRYTQQGFGNLPI 
CMAKTHLSLSHQPDKKGVPRDFIIjPISDVRASIGAGF 
IYPLVGTMSTMPGLPTRPCFYDIDLDTETEQVKGLF 


2288 


A 


474 


4247 


IISIISTSNKIKMSEAPRFFVGPEDTEINPGNYRHFF 

HHADEDDEEEDDSPPERQIWGICSMAKKSQIPNPMK 

EILERISLFKYITLWFEEEVILNEPVENWPLCDCIil 

S FHS KGF PLDKAVAYAKLRN P F VI NDLNMQ YL I QDRR 

EVYSILQAEGILLPRYAILNRDPNNPKECNLIEGEDH 

VEVNGEVFQKPFVEKPVSAEDHNVYIYYPTSAGGGSQ 

RLFRKIGSRSSVYSPESNVRKTGSYIYEEFMPTDGTD 

VKVYTVGPDYAHAEARKSPALDGKVERDSEGKEVRYP 

VI LNAREKLI AWKVCLAFKQTVCGFDLLRANGQS YVC 

DVNGFSFVKNSMKYYDDCAKILiGNIVMRELAPQFHIP 

WS I PLEAEDI PI VPTTSGTMMELRCVI AVIRHGDRTP 

KQKMKMEVRHQKFFDLFEKCDGYKSGKLKLKKPKQLQ 

EVLDIARQLLMELGQNNDSEIEENKPKLEQLKTVLEM 

YGHFSGINRKVQLTYLPHGCPKTSSEEEDSRREEPSL 

LLVLKWGGELTPAGRVQAEELGRAFRCMYPGGQGDYA 

GFPGCGLLRLHSTYRHDLKIYASDEGRVQMTAAAFAK 

GLLALEGELTPILVQMVKSANMNGLLDSDSDSLSSCQ 

QRVKARLHEILQKDRDFTAEDYEKLTPSGSISLIKSM 

HLIKNPVKTCDKVYSLIQSLTSQIRHRMEDPKSSDIQ 

L YHS ETLE LMLRRWSKLE KD FKTKNGRYD I S KI PDI Y 

DCIKYDVQHNGFLEIRKTQWELYRLSKALADIVIPQE 

YGITKAEKLEIAKGYCTPLVRKIRSDLQRTQDDDTVN 

KLHPVYSRGVLS PERHVRTRL YFTSESHVHSLLS I LR 

YGALCNESKDEQWKRAMDYIjNVVNELNYMTQIVIMLY 

EDPNKDLSSEERFHVELHFSPGAKGCEEDKNIjPSGYG 

YRPASRENEGRRPFKIDNDDEPHTSKRDEVDRAVILF 

KPMVSEPI HI HRKS PLPRSRKTATNDEE S PLS VS S PE 

GTGTWLHYTSGVGTGRRRRRSGEQITSSPVSPKSLAF 

TS S I FGS WQQWSENANYLRT PRTLVEQKQNPTVGSH 

CAGLFSTSVLGGSSSAPNLQDYARTHRKKLTSSGCID 

DATRGSAVKRFYISFARHPTNGFELYSMVPSICPLET 

LHNALSLKQVDEFLASIASPSSDVPRKTAEISSTALR 

SSPIMRKKVSLNTYTPAKILPTPPATLKSTKASSKPA 

TSGPS S AWPNTS S RKKNI TS KTETHEHKKNTGKKK 


2289 


A 


3 


552 


FIDDELATEWSLTMETLTKVLARNLYSLDLSDIiPLDK 
LSEQKQKKHKGKGVGHEFQKVSVDKSFSRGWSRDQPG 
QAPMRQRS ATTTGS PGTEKARSI VRQKTVDI DDAQI L 
PRSTRVRHFSQSEETGNEVFGALNEEQPLPRSSSTSD 
ILEPFTVERAKGAVPVIDSSSRHAPSLQSFTEASS 


2290 


A 


3 


147 


Q PLNHYF I C S SHNT YLVGDQLCGQS SVEGYI RCSGGR 
EGVQLMRGTM 


2291 


B 


1 


498 


MDLCQKNETDLENAENNEIQFTEETEPTYTCPDGKSE 
KNHVYCLLD VS DI TLEQDE KAKE F I IGTGWEEAPPQR 
SSPAVGLRQPGLPGPHLLGPTGGRKGLGGTRHQGPEE 
EQRNAFGTAWTPETHPTRGHTGRTEAAVAGGDARPEG 
RLLGSRQIiNRLPDAETQ 


2292 


A 


963 


5 


LD FLCHRDMGDN I TS I TE FLLLGF P VGPRI QMLLFGL 
FSLFYVFTLLGNGTILGIiISLDSRLHAPMYFFLSHL\ 
AWDI AYACNTVPRMLVNLLHPAKPI SFAGRMMQTFIi 
FSTFAVTECLLl.WMSYDLYV\AICHPLRYliAIMTWR 
VCITLAVTSWTTGVLLSLIHLVLLLPLPFCRPQKIYH 
FFCEILAVLKLACADTHINENMVLAGAI SGLVGPLST 
IWSYMCILCAILQIQSREVQRKAFCTCFSHLCVIGL 
FYGTAI IMYVGPRYGNPKEQKKYLLLFHSLFNPMLNP 
LICSLRNSEVKNTLKRVLGVERAL 


2293 


A 


1306 


158 


I S YC PKF PNRDQRDKDGDGVGDACDS C PDVSNPNQSD 
VDNDIiVGDSCDTNQDSDGDGHQDSTDNCPTVINSAQL 
DTDKDGIGDECDDDDDNDGIPDLVPPGPDNCRLVPNP 
AQEDSNSDGVGDI CESDFDQDQVIDRIDVC PENAEVT 
LTDFRAYQTWLDPEGDAQI D PNWWLNQGME I VQTM 
NS D PGLAVG YTAF \ NGVDF EGT FHVNTQTDDD YAGF I 
FGYQDS S S F YWMWKQTEQT Y WQATPFRAVAE PGI QL 
KAVKSKTGPGEHLRNSLWHTGDTSDQVRLLWKDSRNV 
GWKDKVSYRWFLQHRPQVGYIRVRFYEGSELVADSGV 
T I DTTMRGGRLGVFC F SQEN I I WSNLKYRCNDT I PED 
FQEFQTQNFDRFDN 


2294 


A 


4701 


866 


DAPGRPPVRLPTMELEDGWYQEEPGGSGAVMSERVS 

GLAGS I YRE FERLI VRYDEE WKELI PLWAVLENLD 

SVFAQDQEHQVELELLRDDNEQLITQYEREKALRKHA 

EEKFIEFEDSQEQEKKDLQTRVESLESQTRQLELKAK 

NYADQI S I LEERE AELKKE YNALHQRHTEM I HN YMEH 

LERTKLHQLSGSDQLESTAHSRIRKERPISLGIFPLP 

AGDGLLTPDAQKGGETPGSEQWKFQELSQPRSHTSLK 

DELSDVSQGGSKATTPASTANSDVATI PTDTPLKEEN 

EGFVKVTDAPNKSEISKHIEVQVAQETRNVSTGSAEN 

EEKSEVQAIIESTPELDMDKDLSGYKGSSTPTKGIEN 

KAFDRNTESLFEELSSAGSGLIGDVDEGADLLGMGRE 

VENLILENTQLLETKNALNIVKNDLIAKVDELTCEKD 

VLQGEIiEAVKQAKItKLEEKNRELEEELRKARAEAEDA 

RQKAKDDDDSDI PTAQRKRFTRVEMARVLMERNQYKE 

RLMELQEAVRWTEMIRASRENPAMQEKKRSSIWQFFS 

RLFSSSSNTTKKPEPPVNLKYNAPTSHVTPSVKKRSS 

TLSQLPGDKSKAFDFIiSEETEASLASRREQKREQYRQ 

VKAHVQKEDGRVQAFGWSLPQKYKQVTNGQGENKMKN 

LP VPVYLRPLDKKDTSMKLWCAVGVNLjbvjVjiv. 1 KLNaiao 

WGASVFYKDVAGLDTEGSKQRSASQSSLDKLDQELK 

EQQKELKNQEELSSLVWICTSTHSATKVLIIDAVQPG 

NILDSFTVCNSHVLCIASVPGARETDYPAGEDLSESG 

QVDKASLCGSMTSNSSAETDSLLGGITWGCSAEGVT 

GAATS PSTNGAS PVMDKPPEMEAENSEVDENVPTAEE 

\ATEATEGNAGSAEDTV\DI SQTGVYTEHVFTDPLG\ 

VQI PEDLS PVYQS SNDSDAYKDQI SVLPNEQDLVREE 

AQKMS S LL PTMWLGAQNGCL YVHS S VAQ WRICCLHS I K 
LKDS I LS I VHVKGI VLVALADGTLAI FHRGVDGQWDL 
SNYHLLDLGRPHHS I RCMTWHDKVWCGYRNKI YWQ 
PKAMKI E KS FDAHPRKESQ VRQLAWVGDGVWVS I RLD 
STLRLYHAHT YQHLQDVDI E P YVS KMLGTGKLGFS FV 
RITALMVSCNRLWVGTGNGVI I S I PLTETVI LHQGRL 
LGLRANKTSGVPGNRPGSVIRVYGDENSDKVTPGTFI 
PYCSMAHAQLCFHGHRDAVKFFVAVPGQVI SPQSSSS 
GTDLTGDKGRGHIiHRSLWRRP 


2295 


A 


1 


1668 


AAAAAAGAF AGRRAACGAVLLTELLERAAF YGI TSNL 
VLFLNGAPFCWEGAQASEALLLFMGLTYLGSPFGGWL 
ADARLGRARAI LLSLALYLLGMLAF PLLAAPATRAAL 
CGS ARLLNCTAPGPD AAARCCS PATFAGLVLVGLGVA 
TVKANITPFGADQVKDRGPEATRRFFNWFYWS INLGA 
ILSLGGI AYIQQNVSFVTGYAI PTVCVGLAFWFLCG 
QSVFITKPPDGSAFTDMFKILTYSCCSQKRSGERQSN 
GEGIGVFQQSSKQSLFDSCKMSHGGPFTEEKVEDVKA 
LVKI V P VFLALI P YWTVYFQMQTT YVLQSLHLR I PE I 
SNITTTPHTLPAAWLTMFDAVLILLLIPLKDKLVDPI 
LRRHGLLPSSLKRIAVGMFFVMCSAFAAGILESKRLN 
LVKEKTINQTIGNWYHAADLSLWWQVPQYLLIGI SE 
I FASI AGLE FAYSAAPKSMQSAIMGLFFFFSGVGS FV 
GSGLLALVS I KAIGWMS SHTDFGNINGCYLNYYFFLL 
AAIQGATLLLFLI ISVKYDHHRDHQRSRANGVPTSRR 
A 


2296 


A 


132 


695 


TQRAAT PL PNS PQEAAILGSRRNQAGRVREKVYRSLP 
GPAFLGESWKRLSVLQESFSHLTPRQSQMRKSDIFPK 
SLPSQFFGSFGKPVACVTCACSLQLLKFIPEKSDIDL 
LVYRI DHYQQRLQALFFKKKFQERLAEAKPKVEGRAE 
GCRRLRVESYLIMILEKHFPDILNMPSELQHLPEAAK 
VK 


2297 


A 


5 


505 


CKKCQKKFSSGYQLILHHRVHVIERPYECKECGKNFR 
SGYQLTLHQRFHTGEKPYECTECGKNFRSGYQLTVHQ 
RFHTGEKTYECTQCGKAFIYASHIAQHERIHTGGKPY 
ECQECGRAFSQGGHLRIHQRVHTGEKPYKCKECGKTF 
STRSXLVEHGRVHTDEKPY 


2298 


A 


102 


449 


PAPASGFTQTWGDACDPAAPQRPLEACFS VQS RTS S P 
MEPPIPQSAPLTPNSVMVQPLLDSRMSHSRLQHPLTI 
LPIDQVKTSHVENDYIDNPSLALTTGPKRTRGGAPEL 
APT PA 


2299 


A 


402 


2624 


MAESRGRLYLWMCLAAALASFLMGFMVGWFIKPLKET 
TTS VRYHQS I RWKLVSEMKAENI KS FLRS FTKL PHLA 
GTEQNFLLAKKIQTQWKKFGLDSAKLVHYDVLLSYPN 
ETN AN YI S I VDEHETE IFKTSYLEPP PDG YENVTN I V 
PPYNAFSAQGMPEGDLVYVNYARTEDFFKLEREMGIN 
CTGKI VI ARYGKI FRGNKVKNAMLAGAIGI ILYSDPA 
D YFAPEVQ P Y PKGWMLPGTAAQRGNVLNLNGAGDPLT 
PGYPAKE YTFRLDVEEGVGI PRI PVHPIGYNDAEILL 
RYLGGIAPPDKSWKGALNVSYSIGPGFTGSDSFRKVR 
MHVYNINKITRIYNWGTIRGSVEPDRYVILGGHRDS 
WVFGAIDPTSGVAVLQEI ARS FGKLMS KGWRPRRTI I 
FASWDAEEFGLLGSTEWAEENVKILQERSIAYINSDS 
S I EGNYTLRVDCTPLLYQLVYKLTKEI PS PDDGFESK 
FLYESWVEKDPSPENKNLPRINKLGSGSDFEAYFQRL 
GIASGRARYTKNKKTDKYSSYPVYHTIYETFELVEKF 
YDPTFKKQLSVAQLRGALVYELVDSKII PFNIQDYAE 
ALKNYAAS I YNLS KKHDQQLTDHGVS FDSLFSAVKNF 
SEAASDFHKRLIQVDLNNPIAVRMMNDQLMLLERAFI 
DPLGLPGKLF YRHI I FAPS SHNKYAGES FPGI YDAI F 
DI ENKANS RLAWKEVKKHI S I AAFTI QAAAGTLKE VL 
* 


2300 


A 


74 


520 


PGVGPCLSVPPSAPSLVFRS VAGGAGMAERGLE PS PA 
AVAALPPEVRAQLAELELELSEGDITQKGYEKKRSKL 
LSPYS PQTQETDSAVQKELRNQTPAPSAAQTSAPSKY 
HRTRSGGARDERYRSGEEKLQNGQLNRFPNSSMNCVS 


2301 


A 


6256 


5813 


MALQLWALTLLGLLGAGASLRPRKLDFFRSEKELNHL 
AVDEASGWYLGAVNALYQLDAKLQLEQQVATGPVLD 
NKKCT P P I E ASQCHEAEMTDNVNQLLLVD P PRKRLVE 
CGQLLKGILRSARPEQHLPPPVLRGRQRGEVFRGQQ* 


2302 


A 


402 


578 


MPTYWLANLRPGLQPFLLHFLLEWLAVFCCKIMVLAA 
AGLLPTLHMASFFSNALYNCFY 


2303 


A 


186 


1338 


TRMSRHEGVSCDACLKGNFRGRRYKCLICYDYDLCAS 
CYESGATTTRHTTDHPMQCILTRVDFDLYYGGEAFSV 
EQPQSFTCPYCGKMGYTETSLQEHVTSEHAETSTEVI 
CPICAALPGGDPNHVTDDFAAHLTLEHRAPRDLDESS 
GVRHVRRMFHPGRGLGGPRARRSNMHFTSSSTGGLSS 
SQSSYSPSNREAMDPIAELLSQLSGVRRSAGGQLNSS 
GPSASQLQQLQMQLQLERQHAQAARQQLETARNATRR 
TNTSSVTTTITQSTATTNIANTESSQQTLQNSQFLLT 
RLNDPKMSETERQSMESERADRSLFVQELLLSTLVRE 
ESSSSDEDDRGEMADFGAMGCVDIMPLDVALENLNLK 
ESNKGNEPPPPPL 


2304 


A 


126 


397 


PLTEDGSPGPPPEGFKDLRNQRPPPHTGPWRGPGPSG 
PPRSGQVPDNSTRCFLSDFWSPQGDQRPSCPYTGARP 
RQGAAQHLRC PSRRRR 


2305 


A 


3 


457 


RAFDVRRKKSLRPCCPRDFHAGCLTVSGPSTVMGAVG 
ESLSVQCRYEEKYKTFNKYWCRQPCLPIWHEMVETGG 
SEGWRSDQVI I TDH PGDLTFTVTLENLTADDAGKYR 
CGIATILQEDGLSGFLPDPFFQVQVLVSSASSTENSV 
KTP 


2306 


A 


1 


1117 


NSRVDDFVAVMAPRTLVLLLSGALALTQTWAGSHSMR 
YFFTSVSRPGRGEPRFIAVGYVDDTQFVRFDSDAASQ 
RMEPRAPWI EQEGPE YWDGETRKVKAHSQTHRVDLGT 
LRGYYNQSEAGSHTVQRMYGCDVGSDWRFLRGYHQYA 
YDGKDYIALKEDLRSWTAADMAAQTTKHKWEAAHVAE 
QLRAYLEGTCVEWLRRYLENGKETLQRTDAPKTHMTH 
HPISDHEATLRCWALSFYPAEITIiTWQRDGEDQTQDT 
ELVETRPAGDGTFQKWAAVWPSGQEQRYTCHVQHEG 
LPKPLTLRWEPSSQPTI PI VGI IAGLVLFGAVITGAV 
VAAVMWRRKSSDRKGGSYSQAASSDSAQGSDVSLTAC 
KV 


2307 


A 


3 


491 


DAWVAHASGELPPQTTKTLARFIPEVAVAYPKSKPLT 
TQIKIKKPPKVTMKTGKSLLHLHSTLEMFAARWRSKA 
PMSLFLLEVHFNLKVQYSVHENQLQMATSLDRRGN/Y 
TGFITSYLEEAYIPVVNDVLQVGLPLPDFLAMNYNLA 
ELDIVENALMLDLKLG 


2308 


A 


3 


491 


DAWVAHASGELPPQTTKTLARFIPEVAVAYPKSKPLT 
TQIKIKKPPKVTMKTGKSLLHLHSTLEMFAARWRSKA 
PMSLFLLEVHFNLKVQYSVHENQLQMATSLDRRGN/Y 
TGFITSYLEEAYIPVVNDVLQVGLPLPDFLAMNYNLA 
ELDIVENALMLDLKLG 


2309 


A 


3 


491 


DAWVAHASGELPPQTTKTLARFIPEVAVAYPKSKPLT 
TQIKIKKPPKVTMKTGKSLLHLHSTLEMFAARWRSKA 
PMSLFLLEVHFNLKVQYSVHENQLQMATSLDRRGN/Y 
TGFITSYLEEAYIPVVNDVLQVGLPLPDFLAMNYNLA 
ELDIVENALMLDLKLG 


2310 


A 


3 


491 


DAWVAHASGELPPQTTKTLARFIPEVAVAYPKSKPLT 
TQIKIKKPPKVTMKTGKSLLHLHSTLEMFAARWRSKA 
PMSLFLLEVHFNLKVQYSVHENQLQMATSLDRRGN/Y 
TGFITSYLEEAYIPVVNDVLQVGLPLPDFLAMNYNLA 
ELDIVENALMLDLKLG 


2311 


A 


75 


739 


APRAAPRLTMVSRMVSTMLSGLLFWLASGWTPAFAYS 
PRTPDRVSEAD I QRLLHGVMEQLGI ARPRVE YPAHQA 
MNLVGPQS I EGGAHEGLQHLGP FGNI PNI VAELTGDN 
I PKDFSEDQGYPDPPNPCPVGKTADDGCLENTPDTAE 
FSREFQLHQHLFDPEHDYPGLGKWNKKLLYGKMKGGE 
RRKRRSVNPYLQGQRLDNWAKKSVPHFSDEDKDPE 


2312 


A 


2 


606 


PSIRKHGTHPFPPT*SSPSGSC\SHCIAHSQCRQSPP 
HAS C * RGSRWG * SGRAGWPAPGCR * AAPGLAGS AHPR 
PPPSNPRCPPPDAGPPGSGDPGLAAPEPSNHGRQHTA 
AAAAAGESQRHGRPGLAA* QP PLDTGPAARGS P PAPP 
GARPRGGGRQHRPQGLPQAQPQ*APGVRAAPRAAAPP 
\GHAGPDQAPEKAARTRG 


2313 


A 


42 


706 


PRGQMASTGLELLGMTLAVLGWLGTLVS CALPLWKVT 
AFIGNSIWAQWWEGLWMSCWQSTGQMQCKVYDSL 
LALPQDLQAARALCVI ALLLALLGLLVAI TGAQCTTC 
VEDEGAKARI VLTAGVI LLLAGI LVL I PVCWTAHAI I 
QDFYNPLVAEALKRELGASLYLGWAAAALLMLGGGLL 
CCTCPPPQVERPRGPRLGYSI PSRSGASGLDKRDYV 


2314 


A 


2 


484 


FVANMLCGLSRETPGEADDGPYSKGGKDAGGADVCLtf 
CRRQS I PEE FRGI TWELI KKEGSTLGLTI SGGTDKD 
GKPRVSNLRPGGLAARSDLLNIGDYIRSVNGIHLTRL 
RHDEI ITLLKNVGERWL/EAPENNPRI ISKTVDVSL 
YKEGNS FGFVLRGQ 


2315 


A 


326 


2002 


GLSRMSTETELQVAVKTSAKKDSRKKGQDRSEATLI K 
RFKGEGVRYKAKLIGIDEVSAARGDKLCQDSMMKLKG 
WAGARSKGEHKQKI FLTISFGGIKIFDEKTGALQHH 
HAVHEISYIAKDITDHRAFGYVCGKEGNHRFVAIKTA 
QAAEPVI LDLRDLFQL I YELKQREELE KKAQKDKQCE 
QAVYQTILEEDVEDPVYQYIVFEAGHEPIRDPETEEN 
IYQVPTSQKKEGVYDVPKSQPVSAVTQLELFGDMSTP 
PDITSPPTPATPGDAFIPSSSQTLPASADVFSSVPFG 1 
TAAVPSGYVAMGAYLPS FWGQQ PLVQQQMVMGAQP PV 

-nrwrtjcnr* r^dt AWrcOPPT.TPDZiTr^PWPTVAfiOFPPAAF 

MPTQTVMPLPAAMFQGPLTPLATVPGTSDSTRSSPQT 
DKPRQKMGKETFKDFQMAQPPPVPSRKPDQPSLTCTS 
EAFSS YFNKVGVAQDTDDCDDFDI SQLNLTPVTSTTP 
STNS PPTPAPRQS SPSKS SASHASDPTTDD I FEEGFE 
SPSKSEEQEAPDGSQASSNSDPFGEPSGEPSGDNISP 

QGR 


2316 


A 


132 


428 


VNVLNQE I EAFSLSEDTS SGLPEDRWS VS FRVLYP I 
VITSIiGVFYDANDVGFQRNITVKLYQAEQEEALFIAR 
FSPPSCGVQVNKLWYKPVEQFILPE 


2317 


A 


2334 


1226 


TAAAPVAPGTMDDATVLRKKGYIVGINLGKGSYAKVK 
SAYSERLKFNVAVKI J.AKJvK.1 r lUr VifiKr Ltft^ciinuxxj 
ATVNHGS I IKTYEIFETSDGRI YI IMELGVQGDLLEF 
I KCQGALHED VARKMFRQLS S AVKYCHDLDI VHRDLK 
CENLLLDKDFNIKLSDFGFSKRCLRDSNGRIILSKTF 
CGS AAYAAPEVLQS I PYQPKVYDI WSLGVILYI MVCG 
SMPYDDSDIRKMLRIQKEHRVDFPRSKNLTCECKDLI 
vt3mt r\\ onuc\ WT?T.HTnT7TTi c ?H c ?WIjOPPKPK\ATSSA 
SFKREGEGKYRAECKLDTKTGLRPDHRPDHKLGAKTQ 
HRLLWPENENRMEDRLAETSRAKDHHI SGAEVGKAS 
T 


2318 


A 


993 


848 


tdv7\tdt 7iPPPnwPT?C!PQPRMATHHTLWMGIiALLGVL 
GDLQAAPEAQVSVQPNFQQDKFLGRWFS AGLASNS S W 
LRE KKAALSMCKS WAPATDGGLNLTSTFLRKNQCET 
RTMLLQPAGSLGSYSYRSPHWGSTYSVSVVETDYDQY 
ALLYSQGSKGPGEDFRMATLYSRTQTPRAELKEKFTA 
FCKAQGFTEDTIVFLPQTDKCMTEQ 


2319 


A 


2 


394 


AI HVRCLLS PGHTAGHMSYFLWEDDCPDPPALFSGDA 
t 0Tr7\rir | r«Gr'T.wnQAnnMYnSIjAEIjGTLPPETKVFCGH 
EHTLSNLEFAQKVEPCNDHKRDEDDVPTVPSTLGEER 
LYNPFLRVAEEPVRKFTGKA 


2320 


A 


2 


762 


LEEVLKSELSGNFEKTALALLDHPSEYAARQLQKAMK 
GLGTDESVLIEFLCTRTNKEI I AIKEAYQRLFDRSLE 
SNVKGDTSGNLKKI LVSLLQANRNEGDDVDKDLAGQD 
AKDLYDAGEGRWGTDELAFNEVLAKRSYKQLRATFQA 
YQILI GKDI EEAI EEETSGDLQKAYLTLVRCAQDCED 
YFAERLYKSMKGAGTDEETLIRIIVTRAEVDLQGIKA 
KFQEKYQKSLSDMVRSDTSGDFRKLLVALLH 


2321 


A 


•a 


1335 


QHS SRAGI SSVAMPWAPLGHSGSHQLCVTFS SLHCLT 
RRNMHQMTDGLDKPGQIRWPLAITLAIAWILVYFCIW 
KGVGWTGKWYFSATYPYIMLI ILFFRGVTLPGAKEG 
ILFYITPNFRKLSDSEVWLDAATQI FFSYGLGLGSLI 
ALGS YNS FHNNVYRDS 1 1 VCC INSCTSMFAGFVI FS I 
VGFMAHVTKRS I ADVAASGPGLAFLAYPEAVTQLPI S 
PLWAILFFSMLLMLGIDSQFCTVEGFITALVDEYPRL 
LRNRRELFIAAVCIISYLIGLSNITQGGIYVFKLFDY 
YSAS04SLLFLVFFECVSISWFYGVNRFYDNIQEMVG 
SRPCI WWKLCWS FFTPI I VAGVFI FSAVQMTPLTMGN 
YVF PKWGQGVGWLMALS SMVL I PGYMAYMFLTLKGSL 
KQRI QVMVQ PS EDI VR PENGPEQPQAGS STSKE AYI 


2322 


A 


775 


945 


MMYILLVFLTLWLL I EM I HCLQNGDHRRTRP PTETGW 
LPLRFHLRTGKI LRYLRGE * 


2323 


A 


197 


598 


MSALRPLLLLLLPLCPGPGPGPGSEAKVTRSCAETRQ 
VLGARGYSLNLIPPALISGEHLRVCPQEYTCCSSETE 
QRLIRETEATFRGLVEDSGSFLVHTLAARHRKFDEFF 
LEMLFETLAFFCPDLSSHSTGA* 


2324 


A 


2031 


56 


GTAETFHSVHFCPQPVPKAPESPSLDSALASPLDPQA 
LACTPASPPDSQPPASPQDSEALDFETPSSSIiAPQTP 
DS ALASETLAS PQSLP PAS PLLEDREEGDLGKASELA 
ETPKEEKAEGAAMLELVGS ILRGCVPGVYRVQTVPSA 
RRPVVKFCHRPSGLHGDVSLSNRLALHNSRFLSLCSE 
LDGRVRPLVYTLRCWAQGRGLSGSGPLLSNYALTLLV 
I YFLQTRD P PVLPTVS QLTQKAGEGEQVE VDGWDCS F 
PRDASRLEPSINVEPLSSLLAQFFSCVSCWDLRGSLL 

NVAANVTSRVAGRIiQNCCRAAANYCRSLQYQRRSSRG 
RDWGLLPIiLQPSSPSSLLSATPIPLPLAPFTQLTAAL 

LKVDGQKNCCEEGKEEQQGCAGDGGEDRVEEMVIEVG 
EMVQDWAMQS PGQPGDLPLTTGKHGAPGEEGQPSHAA 
LAERGPKGHEAAQEWSQGEAGKGASLPSSASWRCALW 
HRVWQGRRRARRRLQQQTKEGAGGGAGTRAGWLATEA 
QVTQELKGLSGGEERPETEPLLSFVASVSPADRMLTV 
TPLQDPQGLFPDLHHFLQVFL PQAIRHLK 


2325 


A 


3 


262 


SLSMCREVHVYE YI PSVRQTELCHYHELYYDAACTLG 
AYHPLLYEKLLVQRLNMGTQGDLHRKGKVA^jPGFQAV 
HCPAPSPVIPHS 


2326 


A 


241 


1449 


ASLCKGCFFVTHVLVTILPSLQSPPTFGFLLDIDGVIj 

XrDPUDXTT OA AT V2M?DT5TA7KTCnnnT.PTrD\n7P\7TNAOMT 

VRoHKVi.PAAijJ\Ar KKljvJNbyoyijKVr'V vr V liN/voriN J. 
LQHSKAQELSALLGCEVDADQVILSHSPMKLFSEYHE 
KRMLVSGQGPVMENAQGIjGFRNVVTVDELRMAF plld 
MVDIiERRLKTTPLPRNDFPRIEGVLLLGEPVRWETSL 
QLIMDVLLSNGS PGAGLATPPYPHLPVLASNMDLLWM 
AEAKMPRFGHGTFLLCLETIYQKVTGKELRYEGLMGK 
PS ILTYQYAEDLIRRQAERRGWAAPI RKLYAVGDNPM 
SDVYGANLFHQYLQKATHDGAPELGAGGTRQQQPSAS 
QSCI S ILVCTGVYNPRNPQSTE PVLGGGE PPFHGHRD 
LCFSPGLMEASHWNDVNEAVQLVFRKEGWALE 


2327


A 


241


1449


ASLCKGCFFVTHVLVI ILPSLOS PPTFGFLLDIDGVL 
VRGHRVI PAALKAFRRLVNSQGQLRVP WF VTN AGN I 
LQHS KAQELS ALIjGCEVDADQVI LSHS PMKLFSEYHE 
KRMLVSGQGPVMENAQGLGFRNWTVDELRMAFPLLD 
MVDLERRLKTT PL PRNDF PRI EGVLLLGE P VRWETS L 
QLIMDVLLSNGSPGAGLATPPYPHLPVLASNMDLLWM 
AEAKMPRFGHGTFLLCLETI YQKVTGKELRYEGLMGK 
PS I LTYQYAEDLI RRQAERRGWAAP I RKL YAVGDN PM 
SDVYGANLFHQYLQKATHDGAPELGAGGTRQQQPSAS 
QSCISILVCTGVYWPRNPQSTEPVLGGGEPPFHGHRD 



WO 2004/080148


PCT/US2003/030720


TABLE 7


LCFSPGLMEASHVVNDVNEAVQLVFRKEGWALE 


2328 


A 


1 


359 


ISGESIYWSQKPTPSSNASPWSEPAAVDVELTAYALL
AQLTKPSLTQKEIAKATSIVAWLAKQRNAYGGFSSTQ

rvnnT'AT.i'^aT a K"VATTA WDQT?PTKTT .\rU"Tf GTTZMPORTF 
JJX VV A 1 >\Jr* ' lAMftl inl V lro.EiI2iXiN.LjV V IvO iC*iNryivi c 

NIQAVNRM 


2329 


A 


1 


359 


ISGESIYWSQKPTPSSNASPWSEPAAVDVELTAYALL 
AQLTKPSLTQKEIAKATSIVAWLAKQRNAYGGFSSTQ

nmn77\T/>AT ?\ W^TTA V\/DQT?PTKrTA7VTf QTPNTFORTI*' 

UX V VALiy/\Jj/vt\.Ix\l inl VfOEiIiXiNlJV VJXJD JL HiMXT v 1 ^- 1 - r 

NIQAVNRM 


2330 


A 


1 


359 


ISGESIYWSQKPTPSSNASPWSEPAAVDVELTAYALL
AQLTKPSLTQKEIAKATSIVAWLAKQRNAYGGFSSTQ

Tvmnr a t a t . 7A 1TV7A TT A WP Q 1? 17 T MT .\7\IK GTRNFORTF 
XJX V ViVijyAii/vl^XAl Inl V rOCiolWJjv v rvo J. gin r -*■ c 

NIQAVNRM 


2331 


A 


1 


359 


ISGESIYWSQKPTPSSNASPWSEPAAVDVELTAYALL
AQLTKPSLTQKEIAKATSIVAWLAKQRNAYGGFSSTQ

TYT\ ATA T jT\ A T ■ A TTV ATTA WP^ PR TNT A/VKSTENFORTF 

NIQAVNRM 


2332 


A 


1 


359 


AQLTKPSLTQKEIAKATSIVAWLAKQRNAYGGFSSTQ
DTWALQ ALAKYATTAYVPS EE INLWKSTENFQRTF 

NIQAVNRM


2333 


A 


21 


446 


MES AVRVES GVLVGWCLLLAC PATATGPE VAQ PEVD 
TTLGRVRGRQVGVKGTDRLVNVFLGI PFAQPPLGPDR 
FSAPHPAQPWEGVRDASTAPPMCLQDVESMNSSRFVL 
NGKQQI FSVSEDCLVLNVYS PAEVPAGSGRP 


2334 


A 


320 


171 


AASTTDGSYKCLCLPGYVPSDKPNYCTPLNTALNLEK 
CPFGLPHLSGSS 


2335 


A 


351 


49 


PASPPRWGCWGCWGRWDCFASRSPWARS*SRRPPRST 
7v a ADDODAonD r rr , ar , r* , TPPTWVTnRPARSRRSGRTPR 
AGR* K* S PGSGTRTSRPGGRRRPAGAR 


2336 


A 


3 


813 


rpTjTv QUMAurn a q CT? AMPT.VRTYTifJKDAGFDSEI FKRS 
TFGPSVEFTSVLKPVFAREKEPFSLSCLFSEDVLDAE 
g t nwRp nn Q T .T .R Q c; R R R TC T T . YTDROAS L KVS CT Y KED 
EGLYMVRVP S PFGPREQSTYVLVRDAEAENPGAPGS P 
LNVRCLDVNRDCLI LTWAPPSDTRGNPITAYTI ERCQ 
GESGEWIACHEAPGGTCRCPIQGLVEGQSYRFRVRAI 
SRVGSSVPSKASELWMGDHDAARRKTEI PFDLGNKI 

TTCSTHARRnTV 
llol L/rVT £iUX V 


2337 


A 


834 


628 


DIREYK*NNPLVHMRTDET*MTMK* *MVKEKKI VKED 
WRKVHLAS * QS F PS FFVI EHS KAI RGSWF PQL 


2338


A


O 


628 


DIREYK*NNPLVHMRTDET*MTMK* *MVKEKKI VKED 
WRKVHLAS *QSFPS FFVI EHS KAI RGSWF PQL 


2339 


A 


3 


449 


PGAPRVRLETHPEPLPSDTMVSSCCGSVCSDQGCGLE 
TCCRPSCCQTTCCRTTCCRPSCCVSSCCRPQCCQSVC 
CQPTCCRPSCCPSCCQTTCCRTTCCRPSCCVSSCCRP 
QCCQSVCCQPTCCRPSCSISSCCRPSCCVSRCCRSQR 
C 


2340 


A 


3 


449 


PGAPRVRLETHPEPLPSDTMVSSCCGSVCSDQGCGLE 
TCCRPSCCQTTCCRTTCCRPSCCVSSCCRPQCCQSVC 
CQPTCCRPSCCPSCCQTTCCRTTCCRPSCCVSSCCRP 
QCCQSVCCQPTCCRPSCSISSCCRPSCCVSRCCRSQR 
C 


2341 


A 


3 


449 


PGAPRVRLETHPEPLPSDTMVSSCCGSVCSDQGCGLE 
TCCRPSCCQTTCCRTTCCRPSCCVSSCCRPQCCQSVC 
CQPTCCRPSCCPSCCQTTCCRTTCCRPSCCVSSCCRP 
QCCQSVCCQPTCCRPSCSISSCCRPSCCVSRCCRSQR 
C 


2342 


A 


38 


1435 


ACLICFRIGRGNCSRKICEEFLNPQILLTLELWTLA 
GK3FKCRCWTMLETLSRQWIVSHRMEMWLLILVAYMFQ 
RNVNSVHMPTKAVDPEAFMNI SEI IQHQGYPCEEYEV 
ATEDGYIIjSVNRI PRGLVQPKKTGSRPWLLQHGLVG 
GASNWI SNLPNNSLGFI LADAGFDVWMGNSRGNAWSR 
KHKTLSIDQDEFWAFSYDEMARFDLPAVINFILQKTG 
QEKI YYVGYSQGTTMGFI AFSTMPELAQKI KMYFALA 
PIATVKHAKSPGTKFLLLPDMMIKGLFGKKEFLYQTR 
FLRQLiVI YLCGQVI LDQI CSN IMLLLGGFNTNNMNMS 
RASVYAAHTLAGTSVQNILHWSQAVNSGELRAFDWGS 
ETKNLEKCNQPTPWYRWDMWPTAMWTGGQDWLSN 
PEDVKMLLSEVTNLI YHKNI PEWAHVDFIWGLDAPHR 
M YNE I I HLMHQEETQ PF PRT A 


2343 


A 


38 


1435 


ACLI CFRIGRGNCSRKI CEE FLNPQI LLTLELWTLA 
GKNKCRCWTMLETLSRQWIVSHRMEMWLLILVAYMFQ 
RNVNSVHMPTKAVDPEAFMNI SEI IQHQGYPCEEYEV 
ATEDGYI LS VNRI PRGLVQPKKTTGSRPVVLIjQHGLxAj 
GASNWI SNLPNNS LGF I LADAGFDVWMGNSRGNAWSR 
KHKTLSIDQDEFWAFSYDEMARFDLPAVINFILQKTG 
QEKI YYVGYSQGTTMGFI AFSTMPELAQKI KMYFALA 
PI ATVKHAKS PGTKFLLLPDMMI KGLFGKKE FLYQTR 
FLRQLVI YLCGQVI LDQICSNIMLLLGGFNTNNMNMS 
RAS VYAAHTLAGTS VQNI LHWSQAVNSGELRAFDWGS 
ETKNLEKCNQPTPVRYRVRDMTVPTAMWTGGQDWLSN 
PEDVKMLLSEVTNLI YHKNI PEWAHVDFIWGLDAPHR 
MYNEI IHLMHQEETQPFPRTA 


2344 


A 


91 


1042 


VTMYKDCIESTGDYFLLCDAEGPWGIILESLAILGIV 
VTILLLLAFLFLMRKIQDCSQWNVLPTQLLFLLSVLG 
LFGLAFAFI I ELNQQTAPVRYFLFGVLFALCFS CLIiA 
HASNLVKLVRGCVSFSWTTILCIAIGCSLLQI I IATE 
YVTLIMTRGMMFVNMTPCQLNVDFVVLLVYVLFLMAL 
TFFVSKATFCGPCENWKQHGRL I FI TVLFS III WWW 
I SMLLRGNPQFQRQPQWDDPWCI ALVTNAWVFLLLY 
T VPET.PT LYRSCROECPLOGNACPVTAYQHS FQVENQ 
ELSRDKWKVLLNSDFLSHSGA 


2345 


A 


2 


669 


AHTMVPEEEPQDREKGLWWVQVKVWSMAWSILLLSV 
CFTVSSVVPHNFMYSKTVKRLSKLREYQQYHSSLTCV 
MEGKDIEDWSCCPTPWTSFQSSCYFISTGMQSWTKSQ 
KNCSVMGADLWINTREEQDFI IQNLKRNSS YFLGLS 
DPGGRRHWQWVDQTPYNEN\SREYRMRFWHSGEPNNL 
DERCAIINFRSSEEWGWNDIHCHVPQKSICKMKKIYI 


2346 


A 


2 


669 


AHTMVPEEE PQDREKGLWWVQVKVWSMAWS ILLLSV 
C FTVS S WPHNFMYS KTVKRLS KLRE YQQ YHS S LTC V 
MEGKDIEDWSCCPTPWTSFQSSCYFISTGMQSWTKSQ
KNCSVMGADLWINTREEQDFIIQNLKRNSSYFLGLS 
DPGGRRHWQWVDQTPYNEN\SREYRMRFWHSGEPNNL 
DERCAI INFRS SEEWGWNDIHCHVPQKS I CKMKKI YI 


2347 


A 


1 


2093 


MLVLNSWAQVIHWPQPPKVLGLQPLEKTQYGFLGTDR 
VEEKTSVITIRVSVTHRHNSYMEAENLTELSKFLLLG 
LSDDPELQPVLFGLFIiSMYLVTVLGNLLI ILAVSSDS 
HLHTPMYFFLSNLS FVDI C F I STTVPKMLVS I QARSK 
DISYMGCLTQVYFLMMFAGMDTFLLAVMAYDRFVAIC 
HPLHYTVIMNPCLCGLLVLAS WFI I FWFSLVHI LLMK 
RLTFSTGTEI PHFFCEPAQVLKVACSNTLLNNI VLYV 
ATALLGVFPVAGILFSYSQIVSSLMGMSSTKGKYKAF 
STCGSHLCWSLFYGTGLGVYLSSAVTHSSQSSSTAS 
VMYAMVTPMLNPFIYSLRNKDVKGALERLLSRADSCL 
LRCPSYTEPQNLTGVSEFLLLGLSEDPELQPVLAGLF 
LSMYLVTVLGNLLIILAVSSDSHLHTPMYFFLSNLSL 
ADIGFTSTTVPKMIVDMQTHSRVISYEGCLTQMSFFV 
LFACMDDMLLS VMAYDRFVAI CHPLHYRI IMNPRLCG 
FLILLSFFISLLDSQLHNLIMLQLTCFKDVDISNFFC 
DPSQLLHLRCSDTFINEMVT YFMGAI FGCLPI SGILF 
SYYKIVSPILRVPTSDGKYKAFSTCGSHLAWCLFYG 
TGLVGYLS S AVLPS PRKSMVAS VMYTWT PMLN PF I Y 
SLRNKDIOS ALCRLHGRI I KSHHLHPFCYMG 


2348 


A 


773 


317 


QCTQKAAEGYTQFYYVDVLDGKLACVNKCTKGTKSQM 
NCNLGTCQLQRSGPRCLCPNTNTHWYWGETCEFNIAK 
SLVYGIVGAVMAVLLLALIILIILFSLSQ\RKRHRPE 
SEGEADFGLENATNNFG\PTLETVDSGTELHIQ\RPE 
MVASTV 


2349 


A 


55 


414 


MALTGYS WLLLS ATFLNVGAE I S I TLE PAQPS EGDNV 
TLWHGLSGELLAYSWYAGPTLSVSYLVASYIVSTGD 
ETPGPAHTXREAVRPDGSLDI QGI LPRHS STYI LQTF 
NRQLQTEVG 


2350 


A 


1 


790 


RGYNPNVNAGI INS FATAAFRFGHTLINPI LYRLNAT 
LGEISEGHLPFHKALFSPSRI IKEGGIDPVLRGLFGV 
AAKWRAPSYLLSPELTQRLFSAAYSAAVDSAATI IQR 
GRDHGIPPYVDFRVFCNLTSVKNFEDLQNEIKDSEIR 
QKLRKLYGSPGDIDLWPALMVEDLI PGTRVGPTLMC/ 
ML/ STQFQRLRDGDRFWYENPGVFTPAQLTQLKQASL 
SRVLCDNGDS IQQVQADVF/ RKRQEYPQDYLNCKRES 
PNVDPAKC 


2351 


A 


1 


790 


RGYNPNVNAGI I No FA 1AAJ? KroH 1 JjIin trxij I klinai 
LGEISEGHLPFHKALFSPSRI IKEGGIDPVLRGLFGV 
AAKWRAPSYLLSPELTQRLFS AAYS AAVDSAATI IQR 
GRDHGI PPYVDFRVFCNLTSVKNFEDLQNEIKDSEIR 
QKLRKLYGSPGDIDLWPALMVEDLI PGTRVGPTLMC/ 
ML/STQFQRLRDGDRFWYENPGVFTPAQLTQLKQASL 
SRVLCDNGDS IQQVQADVF/ RKRQEYPQDYLNCKRES 
PNVDPAKC 


2352 


A 


1 


671 


NFLPRRLLLTGPPQVGKTGSYLQFLRILFRMLIRLLE 
VDVYDEEEINTDHNESSEVSQSEGEPWPDIESFSKMP 
FDVSVHDPKYSIiMSLVYTEKLAGVKQEVIKESKVEEP 
RKRETVSIMLTKYAAYNTFHHCEQCRQYMDFTSASQM 
SDSTLHAFTFSSSMLGEEVQLYFIIPKSKESHFVFSK 
QGKHLESMRLPJjVSJJKWljNAvK£>F±r lfobb"ncinvjn 
V 


2353 


A 


2 


805 


RELHKEVEVAKRNLAQQKI I SEMESKLVEQQLAEENK 
LLKEQENMKELWNLLRMTQIKIDEKEQKSKDFLKAQ 
QKYTNI VKEMKAKDLEI RI HKKKKCE I YRRLRE FAKL 
YDTI RNERNKFVNLLHKAHQKVNE I KERHKMSLNELE 
ILRNSAVSQERKLQNSMLKHAtmVTIRESMQNDVRKI 
VS KLQEMKE KKEAQLNN I DRL ANT X TMI E E EMVQLRK 

r>vovi\Trr\UDKTl?DDr , T C DPMTT WTTP T?T .P^rVC^T? T TTRMT 

RYaKAVQHRWhiKKvjJjolrtjiyiJ. i ivUKr utf v xvxd j. a iru.N j. 
QLEKKLMGL 


2354 


A 


159 


1028 


MGLCVPFAVTTSFLSLGLEWDLNVRIiHGQHLVQQIiVL 
RTVRGYLETPQPEKALALS FHGWSGTGKNFVARMLVE 
NLYRDGLMSDCVRMFIATFHFPHPKYVDLYKEQLMSQ 
IRETQQLCHQTLFIFDEAEKLHPGLLEVLGPHIjERRA 
PEGHRAESPWTI FLFLSNLRGDI INEWLKLLKAGWS 

nnnrrrajmrrr nnm /^7\ T7T T rOTT n\TP T7T2T-TQP T A7Tf J7ATT .T D 

REE 1 TMEHLcj PHIjyAcj Ivhll UINor Vjjrto rtij v ruiiN u j. u 
YFI PFLPLE YRHVRLCARDAFLSQELLYKEETLDEI A 
QMMVYVPKEEQLFSSQGCKS I SQRINYFLS * 


2355 


A 


736 


17 


* RAMNFS I CFLE IGS I * TGRYCKTVLC KLRAVL * S FR 
VLNITKAYLVLFSSLYKNLICSSVRSVPLKKFLKSLS 
S I LRDRFFK* T *NPRGERERVLLGDFE * DRFRKCLSL 
IPLGGECSSDLLRTSPSLTALPPNSIHCCSDPCITSI 
NLEPIKLL*HLRPPEASTHEANFTMASPLFRPS*CFK 

KITPSTHlUrlixvlvlKlooor 1K~vjJV1t**JUNI\. oronrnvj 

LVFLGLKLPCPVPLV*NP 


2356 


A 


506 


1317 


GRTS SGKAGMWKPGAE S WPLHTGAAQVMWFEKL YAGL 
QCVEKYLIYPAVVLNALTVDAHTVVSHPDKYCFYCRA 
LLMTVAGLKLLRSAFCCPPQQYLTLAFTVLLFHFDYP 
RliSQGFLLDYFLMSLLCSKLWDLLYKLRFVLTYIAPW 
QITWGSAFHAFAQPFAVPHSAMLFVQALLSGLFSTPL 
NPLLGSAVFIMSYARPIiKFWERDYNTKRVDHSNTRLV 
TQLDRN PGAJJJJJn W Livt olr x cttiu ± xs.o u\in xj^\suxj v 
RWGN YGPGDC F 


2357 


A 


506 


1317 


GRTSSGKAGMWKPGAESWPLHTGAAQVMWFEKLYAGL 

nrnminrr Tvn7\mn M&T/Tn7T^AU T T^A7QHPnKYPFYCRA 
QCVEKYIjI X PiW VlaJNfvLiA V LJf\ri 1 v vonruMv-r ± ^iui 

LLMTVAGLKLIiRSAFCCPPQQYLTIAFTVLLFHFDYP 
RLSQGFLLDYFLMSLLCSKLWDLLYKLRFVLTYIAPW 
QITWGSAFHAFAQPFAVPHSAMLFVQALLSGLFSTPL 
NPLLGSAVFIMSYARPLKFWERDYNTKRVDHSNTRLV 
TQLDRNPGADDNNLNS I F YEHLTRSLQHTLCGDLVLG 
RWGNYGPGDCF 


2358 


A 


3 


301 


STATWAGVQWCNLSSLQPLPSGFKPFSCLSLPGSWDH 
RHIiPPCPANFLYCFFLVEMGFH YVGQAGLKLLT/ S / G 
DLCAS APQS AGSTGVNHRVRLGLLI YI P 


2359 


A 


326 


1379 


PEPHAVQCAELRHQQPRDPQRLQQDGSADAPAERKPH 
CGGERAHGSG\ FLAMLLVLGLCGAAYRPTEE I DLRSV 
GWGNI FQLPFKHVRDYRLRHLVPFFI YSGFE VLFACT 
GIALGYGVCSVGLERLAYLLV\AYSLGASAASLLG\L 
LGLWLPRPVPLVAGAGVHLLLTFI LFF \ WAPVPRVLQ 
HSWILYVAAALWGVGSALNKTGLSTLLGILYEDKERQ 
DFIFTIYHWWQAVAIFTVYLGSSLHMKAKLE\VLLVT 
LVAAAVS YLRMEQKLRRGVAPRQPR \ I PRPQHKVRG\ 
YRYLQAHNSDESDPEGEHADAAQEEAPPAGPRPGP\E 
PAGLGRRPCPYEQAQGGD\GPEEQ 


2360 


A 


2 


1397 


LRAGEDMAAS AS AAAGEEDWVLPSE VE VLES I YLDEL 
QVI KGNGRTS PWEI YITLHPATAEDQDSQYVCFTLVL 
QVPAE YPHEVPQI S I RNPRGLSDEQIHTI LQVLGHVA 
KAGLGTAMLYELIEKGKEILTDNNI PHGQCVICLYGF 
QEKEAFTKTPCYHYFHCHCLARYIQHMEQELKAQGQE 
QEQERQHATTKQKAVGVQCPVCRE PLVYDLAS LKAAP 
EPQQPMELYQPSAESLRQQEERKRLYQRQQERGGIID 
LEAERNRYFISLQQPPAPAEPESAVDVSKGSQPPSTL 
AAELSTS PAVQSTLPPPLPVATQHI CEKI PGTRSNQQ 
RLGETQKAMLDPPKPSRGPWRQPERRHPKGGECHAPK 
GTRDTQELPPPEGPLKEPMDLKPEPHSQGVEGPPQEK 
GPGS WQGP P PRRTRDC VRWERS KGRT PGS S Y PRL PRG 
QGAYRPGTRRESLGLESKDGS 


2361 


A 


718 


305 


SEQEPLLGDTPGSREWDILETEEHYKSRWRSIRILYL 
TMFLSSVGFSWMMSIWPYLQKIDPTADTSFLGWVIA 
SYSLGQMVASPIFGLWSNYRPRKEPLIVSILISVAAN 
CLYAYLHI PASHNKYYMLVARGLLGIG 


2362 


A 


169 


879 


MTAEFLSLLCLGLCLGYEDEKKNEKPPKPSLHAWPSS 
WEAESNVTLKCQAHSQNVTFVIiRKVNDSGYKQEQSS 
AENEAEFPFTDLKPKDAGRYFCAYKTTAS HEWS ESSE 
HLQLWTDKHDELEAPSMKTDTRTI FVAI FSCISILL 
LFLSVF 1 1 YRCSQHS S SSEESTKRTSHSKljPEQbAAi!. 
ADLSNMERVSLSTADPQGVTYAELSTSALSEAASDTT 
OEPPGSHEYAALKV* 


2363 


A 


169 


879 


MTAEFLSLLCLGLCLGYEDEKKNEKPPKPSLHAWPSS 
VVEAESNVTLKCQAHSQNVTFVLRKVNDSGYKQEQSS 
AENEAEFPFTDLKPKDAGRYFCAYKTTASHEWSESSE 
HLQLWTDKHDELEAPSMKTDTRTI FVAI FSC I S ILL 
LFLSVFIIYRCSQHSSSSEESTKRTSHSKLPEQEAAE 
ADLSNMERVSLSTADPQGVTYAELSTSALSEAASDTT 
QEPPGSHEYAALKV* 


2364 


A 


43 I 


369 


AAAWGLAAWGEGPTDATSCWEVGAGGPGNSRPNQTVS 
MDLNSASTVVLQVLTQATSQDTAVLKPAEEQLKQWET 

ADPDVCTn t MTT?TTvTT4^T.nT\n/T?WT, AVT.YPKMnTnR 
X O V Aj J-iN -L r HMnXJJUXJX vi\.rvjjrt.vjji x: xuiuxwi^ 


2365 


A 


4272 


1534 


CHGLQHLTPFRELNLSLQG*EPH*AA*QAVRSEEKSI 
C * GS PSCHLVLGVLVPVARQS SHS AGPAQSAFR* TGT 
GSGTPKAAEQSGYWEAYTLGHQHWNMFPIQRPPLVMK 
GRRIMCGKCEKG*VSDSVTGGRAVAGEQASQRRTVFT 
AGGGECLGAKSVRASVFTGMQPGVMGLLNGKRGGCFE 
SGYLFGFIVIGKIQSLEAKVPLPVNGQTGERASPGNC 
RIHI VDAVC * SEHH* DHFLAAAFLENSTI I S * VAPGS 
WQDHAVLQKEVQAS VRCRGFE S VDTAPAGFWAHS PPG 
LQGE PTTTSVSLFVLAPQDGEGVPFVEGQLVTVLGLV 
VPQSIRHTFVHHTQLFLHPI * KLGALDVAFLHLLTLV 
CS S FNVAYG * GKNGGTTLHQLFAE VNAVTRGS AVQRR 
PSITISSI HVDTKI QQELHDVMVAGADGWQ WGD P FV 
VGLAGI FHL I DDPLHQ I ELS FQRRV * EQCQGVKPDSQ 
PVPRPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLF 
PWRWGLSHRTRDLLRGGDRGHVWIVLCRLGSLVGGL 
GTDELLWFGGR* LI I IGI * * RGRLSGE WGCGLGRGE L 
FQVS IGIGVS I VHIGQGDHEVLGGAGLVERGALHATG 
QGVEALVQQLLDVGPAGALGLCDGAALFQGPGRVGQL 
PAEGLQVCI TLVAQWRMHDGRELGGAEWPWQALHGAA 
I CGVGGAI LLKALSQ YFLKGG * RLWCARGQ * PVKKRQ 
RRWRG*TRR *NGLTIHCFN * LI * GAVCCRLVI LRWCG 
LLEVHGVYGT* IHCLGSFPGRLWP* PFISQERPNGHC 
QWE FRLAVP S WKCRWSRWRVRGTWRYGNPLLNLL*GA 
WLGGAACGGQQGGPLSTWQACTGPGQAAFLPPFQGAC 
RPRTQRCRTWVC P I AWRQLLAYTRD 


2366 


A 


193 


366 


MYGMLEWPISMYFVAFLHCFLCSGGNLGDSFQALPEL 
CANCSSSPRVLCCWMSPLP* 


2367 


A 


1038 


1402 


YYQISSLPS I VGNGI FLWLLI CI FLAKQGGSRL* FQP 
FGRPRGGGHLRSGVLGQPGQHGETP/SFFYNSKISPA 
LWGPPVI PSALGGEAGKSL* PRRQRFQRGGI APLPSR 
VRGRAKLFLKKK 


2368 


A 


480 


226 


MHFLATFALFFI FGVFFLFAVLTNLLLAEEWI RGGN 
FLGSFLVHTLFLDQVPGEITHDSHLVLAITINTASPK 
FSSSIFFYQL* 


2369 


A 


259 


941 


PVSWSLNSCRFFFFF*DQSLPSW/QAGSGQ*RNLDS 
L\QPLASRFK* FSSSRLL\ SS W\ D YRHMATMARL I F I 
FLVEMGF\TMLARLVLNFLTSSDPPTSAFPKWLGLQG 
VKPNTRAVGFN* * LGYYS 1 1 LYHSNS PGTDLVF I LF I 
YLFTYLFLRQEQNSAAQARVQ * WHNLGSLQS PPPGV\ 
H*FLCLSLPSSWDYRCAPPHQANFFIFSRDGVSPCWP 
GWS*TPDLR 


2370 


A 


1676 


1197 


MALRHLALLAGLLVGVAS KSMENTAQLPE CCVD WGV 
NAS C PGASLCGPGC YRRWNADGS AS CVRCGNGTLPAY 
NGS ECKS FAGPGAPF PMNRS S GTPGRPHPGAPRVAAS 
LFLGTFFISSGLILSVAGFFYLKRSSKLPRACYRRNK 
APALQPGERLQ* 


2371 


A 


1078 


594 


VGMELPAVNLKVI LLGHWLLTTWGC I VFSGS YAWANF 
T I LALG VWA VAQRD S I DAI SM FLGGLLAT I F LD I VH I 
S I FYPRVSLTDTGRFGVGMAI LSLLLKPLSCCFVYHM 

AVPEGRSQDARGY 


2372 


A 


3 


517 


HEGRELETGQGRQSSVGAAQGTGVRAGVRAGTTQSGR 
RRARVSGRL AE VSMAS VAWAVLKVLLLLPTQTWS PVG 
AGNPPDCDAPLASALPRSSFSSSSELSSSHGPGFSRL 
NRRDGAGGWTPLVSNKYQWLQI DLGERME VTAVATQG 
GYGSSDWVTSYLLMFSDGGRNWK 


2373 


A 


3 


517 


HEGRELETGQGRQS S VGAAQGTGVRAGVRAGTTQSGR 
RRARVSGRLAEVSMAS VAWAVLKVLLLLPTQTWS PVG 
AGNPPDCDAPLASALPRSS FS SS SELS S SHGPGFSRL 
NRRDGAGGWTPLVSNKYQWLQ I DLGERME VTAVATQG 
GYGSSDWVTSYLLMFSDGGRNWK 


2374 


A 


2 


1078 


GRVGWELWCMYI S PPKDWWDAGD PSLPI RTPAMI GC S 
FVVNRKFFGEIGLLDPGMDVYGGENIELGI KVWLCGG 
SMEVLPCSRVAHIERKKKPYNSNIGFYTKRNALRVAE 
VWMDDYKSHVYIAWMLPLENPGIDIGDVSERRALRKS 
LKCKNFQWYLDHVYPEMRRYNNTVAYGELRNNKAKDV 
CLDQGPLENHTAILYPCHGWGPQLARYTKEGFLHLGA 
LGTTTLLPDTRCLVDNS KSRL PQLLDCDKVKS S L YKR 
WNFIQNGAIMNKGTGRCLEVENRGLAGIDLILRSCTG 
QRWTI KNS I K* REGAGALE PGPQDMAAPPNI WTSC PG 
GETARGRQVLDGPPRASPGQHRDPG 


2375 


A 


2 


630 


ESNSRCRKMPGERCRGGPARLSLLLDLPTRPLPHPRQ 
VIDFGSASIFSEVRYVKEPYIQSRFYRAPEIIiLGLPF 
CEKVDVWSLGCVMDEIiHLGWPLYPGNNEYDQVRYI CE 
TQGLPKPHLLHAACKAHHFFKRNPHPDAANPWQLKSS 
ADYLAETKVRPLERRKYMLKSLDQIETVNGGSVASRL 
TFPDREALAEHADLKSMVELI SAC 


2376 


A 


77 


273 


PRTGMGCCLPGADPAEIRSSPSPSWSTAGSQGCWMTS 
FSPCSCAPCCSSGCACTTGFVSREKESV


2377 


A 


1164 


464 


APWPLPLLRSPQSRPHSLGSLFPSLPGLAELDLQRTL 
SI^APPVKEGPLFIHRTKGKGPLMSSSFKKLYFSLTT 
EALSFAKTPSSKCVNELNQWLSALRKVSINNTGLLGS 
YHPGVFRGDKWSCCHQKEKTGQGCDKTRSRVTLQEWN 
DPLDHDLEAQL I YRHLLGVEAMLWERHRELSGGAEAG 
TVPTS PGKVPEDSLARLLRVLQDLREAHSSS PAGS PP 
SEPNCLLELQT 


2378 


A 


706 


951 


MRCGWGPLGCIiGTGAPAGWMVLGS PRSQLQRARWS RA 
SLSAFGWEIRLRPEGPKAPRQLLLVALESETLGVHGG 
ATPLHCL* 


2379 


A 


2 


456 


CVNTFGSYICKCHKGFDLMYIGGKYQCHDIDECSLGQ 
YQCS S FARC YNVRGS YKCKCKEGYQGDGLTCVYI PKV 
MI E PSGPIHVPKGNGTI LKGDTGNNNWI PDVGSTWWP 
PKTPYI PPI ITNRPTSKPTTRPTPKPTPI PTPPPPPR 
IPP 


2380 


A 


3 


1435 


LRRHFFFPPSFPPLLLPSLPLSSPLSSFPPRSAGACW 
GERLVLQALALRGRPAGSWRGEEAGTAMAPQKHGGGG 
GGGSGPSAGSGGGGFGGS AAVAAATASGGKSGGGS CG 
GGGS YSAS S S S SAAAAAGAAVL PVKKPKMEHVQADHE 
LFLQAFEKPTQIYRFL*TRNLIAPIFLHRTLTYMSHR 
NSRTNI KRKTFKVDDMLS KVE KMKGEQESHSLS AHLQ 
LTFTGFFHKNDKPSPNSENEQNSVTLEVLLVKVCHKK 
RKDVS CPI RQVPTGKKQVPLNPDLNQTKPGNF PS LAV 
S SNE FE PSNS HMVKS YSLLFR VTRPGRRE FNGM I NGE 
TNENIDVNEELPARRKRNREDGEKTFVAQMTVFDKNR 
RLQLLDGE YE VAMQEMEECP I SKKRATWETI LDGKRL 
PPFETFSQGPTLQFTLRWTGETNDKSTAPIAKPLATR 
NSESLHQENKPGSVKPTQTIAVKESLTTDLQKK 


2381 


A 


20 


1748 


KPFNVGLSLNKTERLQLSHGGCKARTAVRAGVFYRAV 
LQPLTLAQGGLPGGSGK/EGSSGCAGTDVGEQASGHR 
ALS *QAVTPAPS *MGHPLSGS * GHQLEPQAGTS PNFA 
LVTLGHSRPQFPQL* GEALGRRGWPQPVS * * PGVS IR 
ET* EAARRGS AS ARQGRPSS * QGTC * I * RT/ AGVKKT 
PAGQAREGQL * GGTAACGAVGPERVGI S PS \QEHGPG 
GRRGVRVDKDTPAESHPHSI PSNKGTPSRKPAVFPGA 
PVPPSLTPLSHKATLPSSLTGGRGGGGGKADCSGEPG 
CPVLCQQMPPFHLPLAPASDHPGSAPGLQPPQRKPEG 
LPGRCRSDPSGVPTAPESGPGPGEPRP/GTQDALWP 
CLGPCSGPSQDLGSGGTCGSLCSRHHPPLPRPT*VAS 
S*GQAGLSFAHPSPP/SRAELGQDANATPPSA*R/GS 
PAQRGINNWGGPVGGAGWAR/ PGQEATPAGTEYG*DC 
PS VGS PQAQDGGQGRRCEGGG\ PGPW * HH* AHS PCGA 
AGCWPRCRRS S AADQRAAQGAP PCAGTGAARRARVRC 
PAGAAGS AAARTRNRPAG* QSAP PGRTRGS 


2382 


A 


84 


428 


MSERVERNWSTGGWLLALCLAWLWTHLTLAALQPPTA 
TVLVQQGTCEVI AAHRCCNRNRI EERSQTVKCS CFSG 
QVAGTTRAKPSCVDDLLLAAHCARRDPRAALRLLLPQ 
PPSS 


2383 


A 


84 


428 


MSERVERNWSTGGWLLALCLAWLWTHLTLAALQPPTA 
TVLVQQGTCEVIAAHRCCNRNRI EERSQTVKCS CFSG 
QVAGTTRAKPSCVDDLLLAAHCARRDPRAALRLLLPQ 
PPSS 


2384 


A 


1919 


3044 


HQGPSTPPSWAMSGPPTPLSREDWHQGPSTPPSWAMS 
E PPT/ S S I QGLASGAVHTI LLGDVRATYTS I QGVTSG 
VSQVSRAAQMAVP S SR I LQLS KPKAPATLLE \ E WDPV 
PKPKPHVSDHNRLLHLAKVPRKEGSGKKVGAFPEIKG 
PEAFRDKARAMESQSNDMPFDELLALYGYEASDPISD 
RESEGGDVDPNLPDMTLDKEQIAKDLLSGEEEEETQS 
SADDLTPSVTSHEASDLFPNRSGCLLAGEAESSRGLL 
PRAQPVPRGAGLADNSRGALLRAHGTVRVGTTATVKP 
ADAPPESPRDRRSRNDSHRPTGPSESERQPQSNQPTL 
LLRGHGTIRVRTTATVKPADAPAES PRDRRSRNDSHG 
QSSRRSC 


2385 


A 


1206 


2266 


RHLLTIFHKLKIYKTINKIDFKKKRVTQLLVFCLFLC 
LFFSSEMVKNQTMVTEFLLLGFLLGPRIQMLLFGLFS 
LFYVFTLLGNGTILGLISLDSRLHTPMYFFLSHLAW 
NI AYACNTVPQMLVNLLHPAKPI SFAGCMT* TFLFLS 
FAHTECLLLVLMSYDRYVAICHPLRYFI IMTWKVCIT 
LAITSWTCGSLLAMVHVSLILRLPFCGPREINHFFCE 
I LSVLRLACADTWLNQWI FAACMF I LVGPLCLVLVS 
YSHILAAILRIQSGEGRRKAFSTCSSHLCWGLFFGS 
AIVMYMAPKSRHPEEQQKVLFLFYSSFNPMLNPLIYN 
LRNVEVKGALRRALCKESHS 


2386 


A 


1206 


2266 


RHLLTI FHKLKI YKTINKIDFKKKRVTQLLVFCLFLC 
LFFSSEMVKNQTMVTEFLLLGFLLGPRIQMLLFGLFS 
LFYVFTLLGNGTILGLISLDSRLHTPMYFFLSHLAW 
NI AYACNTVPQMLVNLLHPAKPI S FAGCMT * TFLFLS 
FAHTECLLLVLMSYDRYVAICHPLRYFI IMTWKVCIT 
LAITSWTCGSLLAMVHVSLILRLPFCGPREINHFFCE 
I LSVLRLACADTWLNQWI FAACMF I LVGPLCLVLVS 
YSHILAAILRIQSGEGRRKAFSTCSSHLCWGLFFGS 
AIVMYMAPKSRHPEEQQKVLFLFYSSFNPMLNPLIYN 
LRNVEVKGALRRALCKESHS 


2387 


A 


176 


371 


HFYFCFSDINLAAEPKVNRGKAGVKRSAAEMYGSVTE 
HPS PS PLLRSGTLLF I TALC P S VGI FS F 


2388 


A 


3870 


3673 


NTQCIPEGLESYYAEQDSSAREKFYTVINHYNLAKQS 
ITRSVSPWMSVLSEEKLSEQETEAAEKSA 


2389 


A 


1 


542 


SGSSHASDGSGFQELRICSEDQTPLIAGMCSLPMARY 
YIIKYADQKALYTRDGQLLVGDPVADNCCAEKICTLP 
NRGLDRTKVP I FLGIQGGSRCLACVETEEGPSLQLED 
VNIEELYKGGEEATRFTFFQSSSGSAFRLEAAAWPGW 
FLCGPAEPQQPVQLTKESEPSARTKFYFEQSW 


2390 


A 


3 


569 


ILNERLANYLQKVRMLERENAELESKIQEESNKELPV 
LCPDYLSYYTTIEELQQKILCTKAENSRLVSQIDNTK 
LTADDLRAKYEAEVSLRQLVESDANGLKQI LNVLTLG 
KADLEAQVQSLKEELLCLKNNHKEEINSLQCQLGERL 
DIEVTAAPSADLNQVLQEMRCQYEPIMETNRKDVEQW 
FNTQ 


2391 


A 


3 


581 


GRRLRSEPRPARPPIARAWPPAPGADGRARRTRVPAP 
CLPRAPCYGVRPRAWRPRPARLRGGLVRWLLSGGPQP 
RRPRATERPSAGTGAAPRRTEPRGRCRGCGRGRG*GP 
RAWGLALCS PHS C SGAAWGPTTGS QRSWPAVARS WQG 
DS SRC P ALRTTTVTAGS KAAL PE S AAE VS PMS S S PGR 
KRSGFAA 


2392 


C 


175 


454 


MGSLCFLPSLQYWCDELKVEXKTQGRGFPLPGSPASA 
SHASWTALVKGVGSGQAQEAEGSEEQE IGES PGQSQG 
VAGAGLGLNEGQVPRMXTR 


2393 


A 


157 


396 


GGGWTSCSVRFLEQQNQVLETKWELLQQLDLNNCKNN 
LEPILEGYISNLRKQLETLSGDRVRLDSELRSVRDW 
EDYKKR 


2394 


A 


126 


561 


WKMKKMCNWLRI INYTPDMARAAVDEAIQEGLEVWSK 
VTPLKFTKI S KGI ADIMI AFRTRVHGRCPRYFDGPLG 
VLGHAFPPGPGLGGDTHFDEDENWTKDGADLHDNS PF 
YGHDGCLAHAFPPGPGIGGDVHFDNDETRTKDFR 


2395 


A 


126 


561 


WKMKKMCNWLRI INYTPDMARAAVDEAIQEGLEVWSK 
VTPLKFTKI S KGI ADIMI AFRTRVHGRCPRYFDGPLG 
VLGHAFPPGPGLGGDTHFDEDENWTKDGADLHDNSPF 
YGHDGCLAHAFPPGPGIGGDVHFDNDETRTKDFR 


2396 


A 


1 


1452 


MAELRPSGAPGPTAPPAPGPTAPPAFASLFPPGLHAI 
YGECRRLYPDQPNPLQVTAIVKYWLGGPDPLDYVSMY 
RNVGS PS AN I PEHWHYI S FGL SDLYGDNRVHEFTGTD 
GPSGFGFELTFRLKRETGESAPPTWPAELMQGLARYV 
FQSENTFCSGDHVSWHSPLDNSESRIQHMLLTEDPQM 
QPVQTPFGWTFLQI VGVCTEELHS AQQWNGQGI LEL 
LRTVPIAGGPWLITDMRRGETIFEIDPHLQERVDKGI 
ETDGSNLSGVS AKCAWDDLSR P PEDDEDSRS I C IGTQ 
PRRLSGKDTEQIRETLRRGLEINSKPVLPPINPQRQN 
GLAHDRAPSRKDSLESDSSTAI I PHELIRTRQLESVH 
LKFNQESGALI PLCLRGRLLHGRHFTYKS I TGDMAI T 
FVSTGVEGAFATEEHPYAAHGPWLQILLTEEFVEKML 



EDLEDLTSPEEFKLPKEYSWPEKKLKVSILPDWFDS 
PLH 


2397 


A 


126 


434 


MCTKTI PVLWGCFLLWNLYVS SSQTI Y PGI KARI TQR 
ALDYGVQAGMKMI EQMLKEKKLPDLSGSESLEFLKVD 
YVNYNFSNIKISAFSFPNTSLAFVPGVGI 


2398 


A 


1489 


290 


FRPLATE PRGS S PVQLVS S TMS VRTLPLLFLNLGGEM 
LYILDQRLRAQNIPGDKARKVLNDI ISTMFNRKFMEE 
LFKPQELYSKKALRTVYERLAHASIMKLNQASMDKLY 
DLMTMAFKYQVLLCPRPKDVLLVTFNHLDT I KGF I RD 
SPTILQQVDETLRQLTEIYGGLSAGEFQLIRQTLLIF 
FQDLHIRVSMFLKDKVQNNNGRFVLPVSGPVPWGTEV 
PGLIRMFNNKGEEVKRIEFKHGGNYVPAPKEGSFEFY 
GDRVLKLGTNMYS WQPVETHVSGS SKNLASWTQES I 
APNPLAKEELNFLARLMGGMEIKKPSGPEPGFRLNLF 
TTDEEEEQAALTRPEELSYEVINIQATQDQQRSEELA 
RIMGEFE I TEQPRLSTS KGDDLLAMMDEL 


2399 


A 


1489 


290 


FRPLATEPRGSSPVQLVSSTMSVRTLPLLFLNLGGEM 
LYI LDQRLRAQN I PGDKARKVLND IIS TMFNRKFMEE 
LFKPQELYS KKALRTVYERLAHAS IMKLNQASMDKLY 
DLMTMAFKYQVLLCPRPKDVLLVTFNHLDTI KGFIRD 
SPTILQQVDETLRQLTEIYGGLSAGEFQLIRQTLLIF 
FQDLHIRVSMFLKDKVQNNNGRFVLPVSGPVPWGTEV 
PGLIRMFNNKGEEVKRIEFKHGGNYVPAPKEGSFEFY 
GDRVLKLGTNMYSWQPVETHVSGSSKNLASWTQESI 
APNPLAKEELNFLARLMGGME I KKPSGPE PGFRLNLF 
TTDEEEEQAALTR PEELS YE VINIQATQDQQRSEELA 
RIMGEFE I TEQPRLSTS KGDDLLAMMDEL 


2400 


A 


1214 


1357 


NKINMFIAALFTIAKT\WNQPK\CPTMIDWIKKRGSS 
RVASSSSPTRTR 


2401 


A 


85 


396 


MILINFRE I CLKVLHTPLCVSGGCVLLYI LALTCCYT 
NSLLISHLPPLSLPTETQTHLFMYRVLKVRKDIKNHV 
FHPTYLVAKETETYGEELI PLPPCREHQD* 


2402 


A 


919 


1439 


KLKDFFFEMEYCSVAQAGVQWSLQPPSPWFKQFSYVS 
LPSSWDYSHLPPCPANLFLVEMRFHLVGQAGLKLLTS 
GDPPASASRSAGI IGVSHHAWPKI KRFYETKWLPILS 
IQLLSGLFIWALLFFCFVLHFCSIIWGNSLEVFPESV 
CRHNKICVLCTQKHNVS YES I TQPV 


2403 


A 


74 


226 


MSSWPRMLAHCFYLLKALSSSYLIKEMTIMPGTLLST 
LCILTHLNLPTPL* 


2404 


A 


255 


369 


PTESAPGLGFCFPDFGQSLPNEKQTSAI\LSDHQQSQ 
LC 


2405 


A 


5671 


1873 


GREREEELQ WRRRRRQRRGAAAPAAPAGGI EAVNMAS 
ASYHI SNLLEKMTSSDKDFRFMATNDLMTELQKDS I K 
LDDDSERKWKMILKLLEDKNGEVQNLAVKCLGPLVS 
KVKEYQVETIVDTLCTNMLSDKEQLRDISSIGLKTVI 
GELPPASSGSALAANVCKKITGRLTSAIAKQEDVSVQ 
LEALDIMADMLSRQGGLLVNFHPSILTCLLPQLTSPR 
LAVRKRTI I ALGHLVMSCGNI VFVDLI EHLLSELSKN 
DSMSTTRTYIQCI AAI SRQAGHRIGEYLEKI I PLWK 
FCNVDDDELREYCIQAFESFVRRCPKEVYPHVSTIIN 
ICLKYLTYDPNYNYDDEDEDENAMDADGGDDDDQGSD 
DEYSDDDDMSWKVRRAAAKCLDAWSTRHEMLPEFYK 
TVS P \ ALI SRFKEREENVRADVFHAYLSLLKQTRPVQ 
SWLCDPDAMEQGETPLTMLQSQVPNIVKALHKQMKEK 
SVKTRQCCFNMLTELVNVLPGALTQHI PVLVPGI I FS 
LND KS S S SNLKI DAL S CL YVI LCNHS PQVFH PHVQ AL 
VP P WAC VGD P F YKI TSEALL VTQQL VKVI RPLDQ PS 
SFDATPYIKDLFTCTIKRLKAADIDQEVKERAISCMG 
QI ICNLGDNLGSDLPNTLQI FLERLKNEITRLTTVKA 
LTLI AGS PLKI DLRPVLGEGVPI LAS FLRKNQRALKL 
GTLSALDILIKNYSDSLTAAMIDAVLDELPPLISESD 
MHVSQMAISFLTTLAKVYPSSLSKISGSILNELIGLV 
RS PLLQGGALS AMLD F FQAL WTGTNNLGYMDLLRML 
TGPVYSQSTALTHKQS YYS I AKCVAALTRAC PKEGPA 
WGQFIQDVKNSRSTDS IRLLALLSLGEVGHHI DLSG 
QLELKSVILEAFSSPSEEVKSAASYALGSISVGNLPE 
YLPFVLQE I TSQPKRQ YLLLH SLKE IIS S AS WGLKP 
YVENIWALLLKHCECAEEGTRNWAECLGKLTLIDPE 
TLL PRLKGYL I SGS S YARSS WTAVKFT I S DHPQ P I D 
PLLKNCIGDFLKTLEDPDLNVRRVALVTFNSAAHNKP 
SLIRDLLDTVLPHLYNETKVRKELIREVEMGPFKHTV 
DDGLDIRKAAFECMYTLLDSCLDRLDIFEFLNHVEDG 
LKDHYDI KMLTFLMLVRLSTLCPSAVLQRLDRLVE PL 
RATCTTKVKANSVKQEFEKQDELKRSAMRAVAALLTI 
PEAEKSPLMSEFQSQISSNPELAAIFESIQKDSSSTN 

LESMDTS 


2406 


A 


1 


824 


THACALISSRFIILSSFHVILNKTKHTCIHTHSLTLK 
MQDEERYMTLNVQSKKRSSAQTSQLTFKDYSVTLHWY 
KI LLGI SGTVNGI LTLTLI SLI LLVSQGVLLKCQKGS 
CSNATQYEDTGDLKVNNGTRRNI SNKDLCASRSADQT 
VLCQSEWLKYQGKCYWFSNEMKS WSDb x V x CliiiKivbii 
LLIIHDQLEMSLV\QF*AFIQKNLRQLNYVWIGLNFT 
SLKMTWTWVDGSPIDSKI FFI KGPAKENSCAAI KESK 
IFSETCSSVFKWICQY 


2407 


A 


182 


418 


MCCELLAWI ATL 1 1 KI GL WLL YF I KLL I HI E F I KR 
HS I LKCES I FNLNVGIRMYPGQVNFCETLQMLDGFGR 
IFQTK 


2408 


A 


65 


320 


LQMS S LPTAAP ALDVDWQS STTF AS C STDMC I HVCRL 
GCDRP VKTFQGHTVS ESS CHWS RVCENVMWE P I L VCL 
ELKATAAADQL 


2409 


A 


923 


358 


YRASQFTIVLEVSVGPPGGSGTGSSGPTHHLPPPPAC 
QDEGSQGTDAPTPGNAENE P PE KETLS P PRRT PAPPE 
\PGSP\APGEGPSGRKRRRVPRDGRPAGNALTPELAP 
VQIKVEEDFGFEADEALDSSWVSRGPDKLLPYPTLAS 
PAFD 


2410 


A 


923 


358 


ALSCGPFPQPLGDKLFRWWLLPLSRFLMRVLDSYGDD 
YRASQFTIVLEVSVGPPGGSGTGSSGPTHHLPPPPAC 
QDEGSQGTDAPTPGNAENE PPEKETLSPPRRTPAPPE 
\PGSP\APGEGPSGRKRRRVPRDGRPAGNALTPELAP 
VQIKVEEDFGFEADEALDSSWVSRGPDKLLPYPTLAS
PAFD 


2411 


A 


923 


358 


ALSCGPFPQPLGDKLFRWWLLPLSRFLMRVLDSYGDD
YRASQFTIVLEVSVGPPGGSGTGSSGPTHHLPPPPAC 
QDEGSQGTDAPTPGNAENEPPEKETLSPPRRTPAPPE 
\PGSP\APGEGPSGRKRRRVPRDGRPAGNALTPELAP 
VQIKVEEDFGFEADEALDSSWSRGPDKLLPYPTLAS 
PAFD 


2412 


A 


12 


1154 


GI LRQKEREERNRI HKKEILFLEHLLWPSEMS SLSG 
KVQTVLGLVEPSKLGRTLTHEHIiAMTFDCCYCPPPPC 
QEAISKEPIVMKNLYWIQKNAYSHKENLQLNQETEAI 
KEELLYFKANGGGALVENTTTGI SRDTQTLKRLAEET 
GVHI ISGAGFYVDATHSSETRAMSVEQLTDVLMNEIL 
HGADGTSI KCGI IGE IGCSWPLTESERKVLQATAHAQ 
AQLGC PVI I HPGRS SRAPFQI IRILQEAGADI SKTVM 
SHLDRTILDKKELLEFAQLGCYLEYDLFGTELLHYQL 
GPDIDMPDDNKRIRRVRLLVEEGCEDRILVAHDIHTK 
TRIiMKYGGHGYSHILTNWPKMLLRGITENVIiDKILI 
ENPKQWLTFK 


2413 


A 


575 


759 


SVYSASSCKCCNYRKTEQIPDCEQPPASSMPERPSHE 
SQPTPQMMPLSAPSRAEELGQRPG 


2414 


A 


131 


1677 


VRGDDLTRALRARRRRSGSGSNFRWE PQATGILLFL 
PPPPVCPAPLPLSLLFPAPPAKMNSSDEEKQLQLITS 
LKEQAIGEYEDLRAENQKTKEKCDKIRQERDEAVKKL 
EEFQKI SHMVI EEVNFMQNHLEI EKTCRESAEAIiATK 
LNKENKTLKRI SMLYMAKLGPDVI TEE INIDDEDSTT 
DTDGAAETCVSVQCQKQIKELRDQIVSVQEEKKILAI 
ELENLKSKLVEVIEEVNKVKQEKTVLNSEVLEQRKVL 
EKCNRVSMLAVEE YEEMQVNLELE KDLRKKAES FAQB. 
MFIEQNKLKRQSHLLLQSSI PDQQLLKALDENAKLTQ 
QLEEERIQHQQKYKhJjfcriy JjliiNii J. 4jHJ\iiAiiNiJ*\\*WiJ» 
LLEEDKKELELKYQNSEEKARNLKHSVDELQKRVNQS 
ENSVPPPPPPPPPLPPPPPNPIRSLMSMIRKRSHPSG 
SGAKKEKATQPETTEEVTDLKRQAVEEMMDRIKKGVH 
LR PVNQTARPKTKPb fa o KJbCii fa A V UaLi JxvaJ. uftoy 


2415 


A 


1157 


918 


RSGVPDQPGQHGEAPSLLKIQNIiAGRSGGPL*SQLLR 
RENRLNLGGGLP * AK.I Ar KJuHFC i rAW V l Utcuo v o xus. 
KILFP 


2416 


A 


70 


222 


MFCSFPLLILQVYPTWKNPNWHLTFHTSVFSFPKGVR 
SLARGI PDHLHS A* 1 


2417 


A 


163 


01 
bil 


MnfWMW&fiT .T ,C PHT iP WT iOGRACRPCGLLiASDAAALWF 
RGGISAWEDSCAVSNIRHEAYNCHLSVFLNRCANELT 
VQFLIILAFQIMLSCAVIAPAVPVFQRLTLKRSGRTS 
LGSTGRLHFCK* 


2418 


A 


60 


266 


MKRLRFVLRVFQMTAFITGAHTITNYSDRRLYI SPLS 
HFFMNSGSSAQSVLSHSYVSQIFFKNVSKYF* 


2419 


A 


218 


1885 


QSDLSTRTQLARLLFCAKTGELVGTMKIFCSRANPTT 
GSVEWLEEDEHYDYHQEIARSSYADMLHDKDRNVKYY 
QGIRAAVSRVKDRGQKALVLDIGTGTGLIiSMMAVTAG 
ADFCYAIEVFKPMADAAVKIVEKNGFSDKI KVINKHS 
TEVTVGPEGDMPCRANILVTELFDTELIGEGALPSYE 
HAHRHLVEENCEAVPHRATVYAQLVESGRMWSWNKLF 
PIHVQTSLGEQVIVPPVDVESCPGAPSVCDIQLNQVS 
PADFTVLSDVLPMFSIDFSKQVSSSAACHSRRFEPLT 
SGRAQWLSWWDIEMDPEGKIKCTMAPFWAHSDPEEM 
QWRDHWMQCVYFLPQEEPWQGSALYLVAHHDDYCW 
YSLQRTS PEKNERVRQMRPVCDCQAHLLWNRPRFGEI 
NDQDRTDRYVQALRTVLKPDSVCLCVSDGSLLSVLAH 
HLGVEQVFTVESSAASHKLLRKIFKANHLEDKINI IE 
KRPELLTNEDLQGRKVSLIiLGEPFFTTSLLPWHNIjYF 
WYVRTAVDQHLGPGAMVMPYAASLHAVWEFKDLWRI 
R 


2420 


A 


2121 


1148 


HYLGSLELGQCGQLSPLPCGLQVALYKSVPTRLLSRA 
WGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVEDL 
HHYRNLSEFFRRKLKPQARPVCGLHSVI SPSDGRILN 
FGQVKNCEVEQVKGVTYSLESFLGPRMCTEDLPFPPA 
AS CDS FKNQLVTREGNELYHC VI YLAPGDYHCFHS PT 
DWTVSHRRHFPGSLMSVNPGMARWI KELFCHNERWL 
TGDWKHGF FS LTAVGATNVGS I R I YFDRDLHTNS PRH 
SKGSYNDFSFVTHTNREGVPMRKGEHLGEFNLGSTIV 
LIFEAPKDFNFQLKTGQKIRFGEALGSL 


2421 


A 


195 


859 


GCPGCCSPRCCLAGAHSDGPGPGSSCSSRGRQVSGNR 
AWTGPSSQARRS PGLRGQGRLAGARPPSWPE/ EDSRV 
PGKDKL * GKELE I S A* SQ P PS ARPPSGCTAPGANRNS 
WTNSSERILRAHF/APLPPSPPPPLEAGG/IiPP*GAT 
RGPSAVPSFPSVSGDWGGPVEAGRAGSRAEGEPGRAL 
APSLLCSLPPRFAGSQALGLPWAVTAERWQELRASEL 
RNR 


2422 


A 


87 


594 


KCLRKSDEALNRVLQQI\RVPPKMKRGTSLHSRRGKP 
EAPKGSPQINRKSGQEMTAVMQSGRPRSSSTTDAPTG 
SAMMEIACAAAAAAAACLPGEEGTAERIERLEVSSIiA 
QTSSAVASSTDGSIHTDSVDGTPDPQRTKAAIAHLQQ 
KI LKLTEQI KI AQTARRNRRPG 


2423 


A 


2230 


990 


NS SG VKLLQALGLS PGNGKDHS I LHSRNDLEE AF I HF 
MGKGAAAERF FSDKETFHDI AQVAS E FPGAQHYVGGN 
AALIGQKFAANSDLKVLLCGPVGPKLHELLDDNVFVP 
PESLQEVDEFHLILEYQAGEEWGQLKAPHANRFI FSH 
DLSNGAMNMLE VFVS S LEE FQ PDLGGLSGLHMMEGQS 
KELQRKRLLEWTSI SDI PTGI PV\HLELG\SMTNRE 

L»MS S I v\ lqqvfpavts lglneqellfltqs as gphs 

S LS S WNGVPDVGMVSD I LFWI LKEHGRS KS RASDLTR 
IHFHTLVYHILATVDGHWANQIiAAVAAGARVAGTQAC 
ATETIDTSRVSLRAPQEFMTSHSEAGSRIVLNPNKPV 
VEWHREGI S FHFTPVLVCKDPIRTVGLGDAI SAEGLF 
YSEVHPHY 


2424 


A 


122 


505 


ML WE L VLLGE PL WMAPS PS E S S ET VLALVNC I S PLK 
YFSDFRPYFTIHDSEFKEYTTRTQAPPSVILGVTNPF 
FAKTLQHWPHI IRIGDLKPTGEI PKQVKVKKLKNLKT 
LDS KPGVYTS YKP YSN * 


2425 


A 


2 


271 


GSVALHVEKLPNEPNRLLILHGFLDENVHFFHTNFLV 
SQLIRAGKPYQLQVALPPVSPQIYPNERHSIRCPESG
EHYEVTLLHFLQEYL


2426 


A 


2 


271 


GSVALHVEKLPNEPNRLLILHGFLDENVHFFHTNFLV
SQLIRAGKPYQLQVALPPVSPQIYPNERHSIRCPESG 
EHYEVTLLHFLQEYL 


2427 


A 


2 


271 


GSVALHVEKLPNEPNRLLILHGFLDENVHFFHTNFLV 
SQLIRAGKPYQLQVALPPVSPQIYPNERHSIRCPESG 
EHYEVTLLHFLQEYL 


2428 


A 


245 


392 


GPGCIPAALLQPPKDDKKKKDAGKSAKKDKDPVNKSG
GKAKKKVEIRPL 


2429 


A 


138 


1671 


EAVQVLI KHSADVNARDKNWQTPLHVAAANKAVKCAE 
VI I PLLSSVNVSDRGGRTALHHAALNGHVEMVNLLLA 
KGANINAFDKKDRRALHWAAYMGHLDWALLINHGAE 
VTCKDKKGYTPLHAAASNGQINWKHLLNLGVEIDEI 
NVYGNTALHI ACYNGQDAWNELI U x QjAju v ^jniminvj 
FTPLHFAAASTHGALCLELLVNNGADVNIQSKDGKSP 
LHMTAVHGRFTRSQTLI QNGGE IDC VDKDGNT PLHVA 
ARYGHELLINTLITSGADTAKCGI HSMFPLHLAALNA 
HSDCCRKLLSSGQKYSIVSLFSNEHVLSAGFEIDTPD 
KFGRTCLHAAAAGGNVECI KLLQSSGADFHKKDKCGR 
TPLHYAAANCHFHCI ETLVTTGANVNETDDWGRTALH 
YAAASDMDRNKTILGNAHDNSEELERARELKEKEATL 
CLE FLLQNDANPS I RDKEGYNS I HYAAAYGHRQ CLEL 
LLERTNSGFEESDSGATKSPLHLAVSEMP 


2430 


A 


1266 


210 


PWAVSQLASGG\ATI PGIRvaAvaKoKlri'iaXU vr a oo 
G/P/SSQYNFIADWEKTAPAWYIEILDRHPFLGRE 
VPI SNGSGF WAADGLI VTNAHWADRRRVRVRLLSG 

nmTmlttnTm^ t rr^ T1X T7\ T\ T 7A TT p TfYFTf VPTiPTTiPLiGRSAD 

DTYEIAVVTAVDPVi\lJ±i\l J_iK.Ly 1 Wirur x jjcjj^*\.4jx-i4^ 

VRQGE F WAMGS P FALQNT I T SGI VS S AQRP ARDLGL 
PQTNVEYIQTDAAIDFGNSGGPLVNLDGEVIGVNTMK 
VTAGISFAIPSDRLREFLHRGEKKNSSSGISGSQRRY 
IGVMMLTLSPSILAELQLREPSFPDVQHGVLIHKVIL 
nnriAuonPT d dp nVT T . A THROMVONAEDVYEAVRTOS 
QLAVQIRRGRETLTLYVTPEVTE 


2431 


A 


80 


403 


MLWFSGVGALAERYCRRSPGITCCVLLLLNCSGVPMS 
▼ -Annnr nvoiriv irr , T7Kn7PT7'\7T.nT PPTTDNPCIMCVCLN 
KEVTCKREKCPVLSRDCALAI KQRGACCEQ CKGC 


2432 


A 


469 


1020 


GISGKAGGSMRSGSVCSGAAAMPIEEPALRSWQRPFL 
KWAGGKYSLLPELDRLI PAGKRLIEPFVGGGSVFLNS 
DKHERFLLADVSADLINLYQMLAWPDSVIYEAMKAF 
PJOiNDAENYTLIREAFNAQRLDAVERAAAFLYLNRHC 
FNGLIRYNLDGFFQQGH* ER* RQVFPRQSWQRTDS 


2433 


A 


1 


266 


GHFRVPALGYLDVRI VDTDYS S FAVLYI YKELEGALS 
TMVQLYSRTQDVS PQALKAFQDFYPTLGLPEDMMVML 
PQSNACNPESKEAP 


2434 


A 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRS 
VRFCSSAPFPKHKPSAKLSVRDALGAQNASGERIKIQ 
GWIRSVRSQKEVLFLHVNDGSSLESLQWADSGLDSR 
ELTFGS S VE VQGQLI KS PSKRQNVELKAEKI KVIGNC 
DAKDFPI KYKERHPLE YLRQYPHFRCRTNVLGS ILRI 
RSEATAAIHSFFKDSGFVHIHTPIITSNDSEGAGELF 
QLEPSGKLKVPEENFFNVPAFLTVSGQLHLEVMSGAF 
TQVFTFGPTFRAEN SQSRRHLAE F YMI EAE I S FVDSL 
QDLMQVIEE LFKATTMMVLS KC PEDVELCHKFI APGQ 
KDRL* HMLKNNFLI I S YTEAVE I LKQASQNFTFTPEW 
GADLRTEHEKYLVKHCGNI PVFVINYPLTLKPFYMRD 
NEDGPQELEGSVA* HSLGLMI LLS I WIGQP i 


2435 


A 


58 


501 


GNKAFVCFYLSQLENYGMPFSRTEDGKIYQRAFGGQS 
LKFGKGRQAHRCCCVADRTGHS I LHTS YGRSLRYDTS 
YFVEYFALDLLMENRECRGVI AQCNEDGS IHHIRAKN 
TWATG* ESNFYF I SFVKMNKFLLECLYFKENRGI VE 


2436 


A 


3 


717 


DSLDNHRCRGDLTKTYSLEAYDNWFNCLSMLVATEVC 
RWKKKHRTRMLEFFIDVAREC FNIGNFNSMMAI I SG 
MNLSPVARLKKTWSKVKTAKFDVLEHHMDPSSNFCNY 
RTALQGATQRSQMANSSREKIVIPVFNLFVKDIYFLP 
QNP\SNHLPNGHINFKKFWEISRQIHEFMTWTQVECP 
FEKDKKIP\SYLLTAPHPTARKLSSSPSFESEGPENH 
MEKDSWKTLRTTLLNRA 


2437 


A 


130 


726 


ITCCGYDALSSIRKNLCCLWICSKPYSLIiMGEGDAFW 
APSVLPHSTLSTLSHHPQPQFGRGMESKVSQGGLNVT 
LT I RLLMHGKE VGS I IGKKGE I TVKKMREESGARINI S 
EGNCPERI VTITGPTDAI FKAFAMIAYKFEEDI INSM 
SNS PATSKPPVTLRLWPASQCGSLIGKGGSKI KEIR 
EVTGPSQPGPLRSL 


2438 


A 


401 


249 


DTLIYTCAPEFDFMEKATPLRYTKTLLLPWMVITCF 
IFKKTVRDISCVIiA 


2439 


A 


1671 


429 


TGGRVGGSRSRRALPLPAPVEAGVLTSAGPSGWWQR 
IEDTTKMAAVSGLVRRPLREVSGLLKRRFHWTAPAAV 
QV\TVRDAINQGMDEELERDEKVFLLGEEVAQ\YDGA 
YKVSRGLWKKYGDKRI I \DTPI SEMGFAWELLVGAAI 
GWGLRPILLNLWTFNFSM\QAI\DQVINSAAKTYYM\ 
SG\GLQPVLIVSWGPN\GASAGVAAQHSQCFAAWYGH 
CPGLKWSP\WTS*DAKGLIKSAIRDNNPWALENEL 
MYGVPF\EFPPEAQSKDF\LIPIGKAKIEMHGTHITV 
VSHSRPVG\HCLRSLPAS/VLSKEGVEC\EVINMRT\ 
IRP\MDMET\IEA\SVMKTKFIL*LWEGGWPQFG\VG 
A\ E I CARI M \ EGPAFNF\ LDAPAVRVTGADVPMP YAK 
I LEDNS I PQVKDI I FAI KKTLNI 


2440 


A 


66 


1349 


APNSESGTQGPLPTPANLFWTRRANPDPTTSMSATDR 
MGPKAVPGLRLALLLLLGLGTPKSGVQGQEGLDFPEY 
DGVDRVINVNAKNYKNVFKKYEVLALLYHEPPEDDKA 
SQRQFEMEEL I LELAAQVLEDKGVGFGLi VDb bJUJAf- V 
AKKLGLTEVDSMYVFKGDEVI EYDGEFSADTI VEFLL 
DVLEDPVELIEGERELQAFENIEDEIKLIGYFKSKDS 
EHYKAFEDAAEEFHPYIPFFATFDSKGAKKLTLKLNE 
IDFYEAFMEEPVTIPDKPNSEEEIVNFVEEHRRSTLR 
KLKPESMYETWEDDMDGIHIVAFAEEADPDGFEFLET 
LKAVAQDNTENPDLSIIWIDPDDFPLLVPYWEKTFDI 
DLSAPQIGWNVTDADRLWMEMDDEEDLPSAEELEDW 
LEDVLEGEINTEDDDDDDDD 


2441 


A 


1002 


2209 


VYPYNPLAFFRERLQPNCPFHSSEGSSGKLS*PPSPS 
FTSSLCDSRTSGFGASSTTH* HS * I RATLI SS AFTLA 
VAWAALLCPPISSCSSETWLSLRQMGSEPKVQPSCCE 
AS PSS VHLP PLPSWAVS VQAS PGSS PSMGPRGS S VS P 
PLAGGEAGLPTSGNPPNSSPWASGQGGWASLSLTSLS 
SQLSGWMAAA*LGSPSSSSSFSWGTWLSPFVSSSITG 
AESTGTSTDAVSNFLSAFKEPEAVMGSGSSWAGSSSS 
RVP PNS S S DEHVPGS P AVS S VATGFTTGS STIiE 1 1 TC 
SVPSGGGLGPGRERLSPLANELGTSGCFSSSDSWNTS 
LLRVSLPGT PGRMAEALLAGLAWFDPVGGFRS VKLDT 
LSLGKAMLSSNKLCFFKIAASFITFRVSSSRI | 


2442 


A 


1 


933 


MGSRLLCWVLLCLLGAGPVKAGVTQTPKHLITATGQQ 
VTLRCSPRSGDLSVSWYQQSLDQGLQFLIQYYNGEER 
AKGNILERFSAQQFPDLHSELNLSSLELGDSALYFCA 
SSVKVGTGELFFGEGSRLTVLEDLiKNVr PFH.WW1?EjF 
S EAE I SHTQ KATLVCLATGF Y PDHVELS WWVNGKEVH 
SGVSTDPQPLKEQPALNDSRYCLSSRLRVSATFWQNP 
RNHFRCQVQFYGLSENDEWTQDRAKFV J.y± vbAifiAWb 
RADCGFTSES YQQGVLSATI LYEI LLGKATLYAVLVS 
ALVLMAMVKRKDSRG 


2443 


A 


368 


18 


SRTPENYLKSSIDSAHRQKRKRTIPSAKGTFPGFFRA 
AKLLCQSLSPFMTGRAP*ALAGDTSAFMALIjPRTHLS 
AT PAVCP F PETF I S S VFVASL FT I LELKYHLLREAFP 
LLPS*N 


2444 


A 


5 


235 


DSSRMSYQQQQCKQPCQPPPVCPTPKCPEPCPPPKCP 
EPCPPPKCPQPCPPQQCQQKYPPVTPSPPCQSKYPPK 

SK 


2445 


A 


82 


2929 


TRTKRRLGREKAMAS PPRGWGCGELLLPFMLLGTLCE 

PGSGQIRYSMPEELDKGSFVGNIAKDLGLEPQEIxAER 

GVRIVSRGRTQLFALNPRSGSLVTAGRIDREELCAQS 

PLC WNFN I LVENKMKI YGVE VE 1 1 D INDNF PRFRDE 

ELKVKVNENAAAGTRLVLP FARDADVGVNS LRS YQLS 

SNLHFSLDVVSGTDGQKYPELVLEQPLDREKETVHDL 

LLTALDGGDPVLSGTTHI RVTVLDANDNAPLFT PSEY 

SVSVPENIPVGTRLLMLTATDPDEGINGKLTYSFRNE 

EEKISETFQLDSNLGEISTLQSLDYEESRFYIiMEWA 

QDGGALVAS AKVWTVQDVNDNAPEVI LTS LTS S I SE 

DCLPGTVIALFSVHDGDSGENGEIACSIPRNLPFKLE 

KSVDNYYHLLTTRDLDREETSDYNITLTVMDHGTPPL 

STESHIPIiKVADVNDNPPNFPQASYSTSVTENNPRGV 

qtt5 o\7T» AunDnQanM ap VTYSTiAEDTFOGAPLS S YVS 

INSDTGVIiYALRSFDYEQLRDLQLWVTASDSGNPPLS 

SNVSLSLFVLDQNDNTPEILYPALPTDGSTGVELAPR 

SAEPGYLVTKWAVDKDSGQNAWLSYRLLKASEPGLF 

AVGLHTGE VRTARAliLDRDAL KQSLWAVEDHGQPPL 

S ATFTVTVAVADRI PD I LADLGS I KT PI DP EDLDLTL 

YLWAVAAVSCVFLAFVIVLIiVLRLRRWHKSRLLQAE 

GSRLAGVPASHFVGVDGVRAFIjQTYSHEVSLTADSRK 

SHLIFPQPNYADTLLSEESCEKSEPLIiMSDKVDANKE 

ERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDTGTWPN 
NQFDTEMLQAMI LASASEAADGSSTLGGGAGTMGLSA 
RYGPQFTLQHVLQGBLGSDYRQNVYI PGSNATLTNAA 
GKRDGKAPAGGNGNKKKSGKKEKK 


2446 


A 


61 


241 


ANLGPTAPPRSGPVLGAGEKGRGEMRRAPFFIiSAGGL 
ETPPPSAALLWAPGRRADEISGL 


2447 


A 


1 


306 


nn n n c n nn n nn u n nnn nn nn n c n n nnnn n nnnnnnnn 
GSCTTCRCYRVGCCSSCCPCCRGCCGGCCSTPVICCC 


2448 


A 


3 


761 


YAKLGTRDPSKLCRHSLKCLECNEVFQDETSLATHFQ 

GIRKVYACSHCPDSRRTFTKRLMLEKHVQIjMHGIKDP 

QASPRSNHSTTEKAENQ\FFKVHKCAVCGFTTENLLQ 
FHEHIPQHKSDGSSYQCRECGLCYTSHVSLSRHLFIV 
HKLKEPQPVSKQNGAGEDNQQENKPSHEGGI P 


2449 


A 


2740 


2525 


MIETWLWLLLLNVGGTGQWSGPTFRRENVLPAAHIGP 
KYGPLLPSTAKGTVKVSCPSSTPHPPLQGKGTPD* 


2450 


A 


c c c 
ODD 


CI *3 


MQT.TiT.PPT.RTiTiTJiTiA ATiVA P AT AATAYR PDWNRLSGL 
TRARVETCGG * 


2451 


A 


42 


266 


KLILLKIQYFNLLMKCCFRIKGKLEEQRPERVKPFMT 
naapnTirp'TT.aMPinjvnTTKrTT.QTWTTrnT . VMVMP KM 

(tMAMiU j_ jtvrl J. i imim j? i\jn iy vim liioivii r\.uu 


2452 


A 


6 


664 


LPGRPTRAPTRPAEHS I VGTRLVSCQLQPSQPNADQG 
ICLTTMRIAVICFCLLGITCAIPVKQADSGSSEEKQLY 
NKYPDAVATWLNPDPSQKQNLIiAPQTLPSKSNESHDH 
MDDMDDEDDDDHVDSQDSIDSNDSDDVDDT\DDSHQS 
DESHHSDES\D\ELVTDFPTDLPATEVFTPWPTVDT 
YDGRGDSWYGLRSKSKKFRRPDIQYPDATDEDITS 


2453 


A 


68 


348 


IQGMHFAAGRLSTKTFCTGHGSPVDICTAKPRDIPMN 
PMGI YRSPEKKATEDEGSEQKI PE ATNRRDVE PTKAN 
SRFATTFYQHLADSKNDND 


2454 


A 


5214 


352 


MAKSGGCGAGAGVGGGNGALTWVNNAAKKE E S ETANK 
NDSSKKLSVERVYQKKTQLEHILLRPDTYIGSVEPLT 
QFMWVYDEDVGMNCREVTFVPGLYKI FDE I LVNAADN 
KQRDKNMTC I KVS I D PE SN 1 1 S I WNNGKGI P WEHKY 
EKVYVPALI FGQLLTSSNYDDDEKKVTGGRNGYGAKL 
CNI FSTKFTVETACKEYKHSFKQTWMNNMMKTSEAKI 
KHFDGEDYTCITFQPDLSKFKMEKLDKDIVALMTRRA 
YDLAGSCRGVKVMFNGKKLPVNGFRSYVDLYVKDKLD 
ETGVALKVIHELANERWDVCLTLSEKGFQQISFVNSI 
ATTKGGRHVDYWDQ WGKLI E WKKKNKAGVS VKPF 
OVKNHIWVFINCLIENPTFDSOTKENMTLQPKSFGSK 
CQLSEKFFKAASNCGI VES ILNWVKFKAQTQLNKKCS 
SVKYSKIKGIPKLDDANDAGGKHSLECTLILTEGDSA 
KS LAVSGLGVI GRDR YGVF PLRGKI LNVREASH KQ IM 
ENAEINNIIKIVGLQYKKSYDDAQSLKTLRYGKIMIM 
TDQDQDGSHIKGLLINFIHHNWPSLLKHGFLEEFITP 
IVKASKNKQELSFYSIPEFDEWKKHIENQKAWKIKYY 
KGLGTSTAKEAKEYFADMERHRILFRYAGPEDDAAIT 
LAFSKiCKIDDRKEWIiTNFMEDRRQRRLHGLPEQFLYG 
TATKHLTYNDFINKELILFSNSDMERSIPSLVDGFKP 
GQRKVL FTC F KRNDKRE VKVAQLAGS VAEMS AYHHGE 
QALMMTIVNLAQNFVGSNNINLLQPIGQFGTRL.HGGK 
DAASPRYIFTMLSTLARLLFPAVDDNLLKFLYDDNQR 
VEPEWYI PI I PMVLINGAEGIGTGWACKLPNYDAREI 
VNNVRRMLDGLD PHPMLPNY KNFKGT I QELGQNQYAV 
SGE I FWDRNTVE ITELPWTWTQVYKEQVLEPMLNG 
TDKTPALISDYKEYHTDTTVKFVVKMTEEKLAQAEAA 
GLHKVFKLQTTLTCNSMVLFDHMGCLKKYETVQDILK 
EFFDLRLSYYGLRKEWLVGMLGAEFTKLNNQARFILE 
KIQGKITI * NRS KKDLI QMLVQRG YE SD P VKAWKE AQ 
EKAAEEDETQNQHDDSSSDSGTPSGPDFNYILNMSLiW 
SLTKEKVEELI KQRDAKGREVNDLKRKS PSDLWKEDL 
AAFVEELDKVESQEREDVLAGMSGKAIKGKVGKPKVK 
KLQLEETMPSPYGRRI I PEI TAMKADASKKLLKKKKG 
DLDTAAVKVEFDEE FSGAPVEGAGEEALTPS VP INKG 
PKPKREKKEPGTRVRKTPTSSGKPSAKKVKKRNPWSD 
DESKSESDLEETEPWI PRDSLLRRAAAERPKYTFDF 
SEEEDDDADDDDDDNNDLEELKVKASPITNDGEDEFV 
PSDGLDKDEYTFSPGKSKATPEKSLHDKKSQDFGNLF 
SFPSYSQKS EDDS AKFDSNEEDS AS VFS PS FGLKQTD 
KVPSKTVAAKKGKPSSDTVPKPKRAPKQKKWEAVNS 
DSDSEFGI PKKTTTPKGKGRGAKKRKASGSENEGDYN 
PGRKTSKTTSKKPKKTSFDQDSDVDIFPSDFPTEPPS 
LPRTGRARKEVKYFAESDEEEDDVDFAMFN 


2455 


A 


2 


154 


FKIQKTRLQREGFDPRQTSDRLFFLDLKQGHYLPIiNE 
AVYTRICSGAFAL 


2456 


A 


483 


765 


FQGQRMAGEQKPSSNLLEQFI LLAKGTSGSALTALI S 
QVLEAPGVYVFGELLELANVQELAEGANAAYLQLLNL 
FAYGTYPDYIANKESLPELY 


2457 


A 


9 


422 


ESRERSGNRRGAEDRGTCGLQS PS AMLGAKPHWLPGP 
LHS PGL PLVLVLLALGAGWAQEGSE PVLLEGECL WC 
EPGRAAAGGPGGAALGEAPPGRVAFAAVRSHHHE PAG 
ETGNGTSGAIYFDQVLVNEGGGFDRAS 


2458 


A 


64 


435 


GRGVCVAAWSQRSIAGNNDYRLFHKMSNSHPLRPFTA 
VGE I DHVHI LSEHI CALLI GE E YGDVTFVGEKKRFPA 
HRVI LAARCQYFRALLYGGMRESQPEAE I PLQDTTAE 
AFTMLLXYIYTGR 


2459 


A 


126 


434 


MCTKTIPVLWGCFLLWNLYVSSSQTIYPGIKARITQR
ALDYGVQAGMKMIEQMLKEKKLPDLSGSESLEFLKVD
YVNYNFSNIKISAFSFPNTSLAFVPGVGI


2460 


A 


126 


434 


MCTKTIPVLWGCFLLWNLYVSSSQTIYPGIKARITQR
ALDYGVQAGMKMIEQMLKEKKLPDLSGSESLEFLKVD 
YVNYNFSNIKISAFSFPNTSLAFVPGVGI 


2461 


A 


126 


434 


MCTKTIPVLWGCFLLWNLYVSSSQTIYPGIKARITQR
ALDYGVQAGMKMIEQMLKEKKLPDLSGSESLEFLKVD 
YVNYNFSNIKISAFSFPNTSLAFVPGVGI 


2462 


A 


3 


1057 


EEEQECRPAIKTSDIDNPSHFEKQYESSSSSTHSDRS 
SDGEQDFVSSILPGNRPNSTNIKPQLHQKSIMKKKAG 
HKANSKH^D*EQTVVDVTEQLGDCKLDSQEKDATCEL 
PLQKVNTQSSSNSTLPGRLKASENSESEYSRSEITLV 
GISKKSAEHFKRKFAKSNQVSRSVSSSVQVCPEVGKR 
NLIiKVLKETLI EWKTEETLRFLYGQNYAS VCLKPEAS 
LVKEELDEDDIISDPDSHFPAWRESQNSLDESLPFRG 
SGTAIKPLPSYENLKKETEKLNLRIREFYRGRYVLGE 
ETTKSQDSEEHDSTFPLIDSSSQNQIRKRIVLEKLSK 
VLPGLLVPLQITLGDIYT 


2463 


C 


135 


341 


MYIKIKPRSFGIIHNLPSKPGPLFLPHSLIGWFDFTA 
SFLYPMNCSAMHHXVRKSSSATAITKIGKTG 


2464 


A 


265 


395 


RLCDGLFPQQDPAAPAPCEETQLSLLPLQGCGLMEGK 
TMEAKT 


2465 


A 


88 


1496 


QETSKMETLSFPRYNVAEIVIHIRNKILTGADGKNLT 
KNDLYPNPKPEVLHMIYMRALQIVYGIRLEHFYMMPV 
NSEVMYPHLMEGFLPFSNLVTHLDSFLPICRVNDFET 
ADILCPKAKRTSRFIjSGIINFIHFREACRETYMEFLW 
QYKSSADKMQQLNAAHQEALMKLERLDSVPVEEQEEF 
KQLSDGIQELQQSLNQDFHQKTIVLQEGNSQKKSNIS 
EKTKRLNELKIjSWSLKEIQESLKTKIVDSPEKLKNY 
KE KMKDTVQ KLKNARQE WE KYE I YGDS VDCLP SCQL 
EVQLYQKKI QDLSDNREKLAS ILKESLNLELX}I ESDE 
SELKKLKTEENS FKRLMI VKKEKLAT AQFKINKKHED 
VKQYKRTVIEDCNKVQEKRGAVYERVTTINHEIQKIR 
LGIQQLKDAADREKLKSQEIFLNLKTALEKYHDGIEK 
AAEDSYAKIDEKTAELKRKMFKMST 


2466 


A 


194 


2287 


GMGSENSALKSYTLREPPFTLPSGLAVYPAVLQDGKF 
AS VFVYKRENEDKVNKAAKVP * * HLKTLRHPCLLRFL 
SCTVEADGIHLVTERVQPLEVALETLSSAEVCAGIYD 
ILLALI FLHDRGHLTHNNVCLSSVFVSEDGHWKLGGM 
ETVCKVSQATPE FLRSIQSI RDPAS I PPEEMS PEFTT 
LPECHGHARDAFSFGTLVESLLTILNEQVSADVLSSF 
QQTLHSTLLNPI PKWRPALCTLLSHDFFRNDFLEWN 
FLKSLTLKSEEEKTEFFKFLLDRVSCLSEELIASRLV 
PLLLNQLVFAEPVAVXKSFIjPYLLGPKKDHAQGETPC 
LLSPALFQSRVI PVLLQLFEVHEEHVRMVLLSHIEAY 
VGALSLREQLKKV\IL\PQVLLG\LRD\TSDSIVAIT 
LHSLAVLVSLIjGPEWVGGERTKIFKRTAP\SFTK\N 
TDLSLEGDPFSQPIKFPINGLSDVKNTSEDSENFPSS 
SKKSEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDD 
VKSQCTTLDVEESSWDDCEPSSLDTKVNPGGGITATK 
PVTSGEQKPIPALLSLTEESMPWKSSLPQKISLVQRG 
DDADQIEPPKVSSQERPLKVPSELGLGEEFTIQVKKK 

r>Trvr»nciumkTA7'C7V nn/T DPTVDCfl a T?T.T T.DPT /R r T 1 T^M\^PTCK' 
DDVS P VMQ F S S KF AAAE I TEGEAEGWEEEGE LN WEDN 
NW 


2467 


A 


2 


868 


IAGVAVFFYRDMFVRKDRKIHKDAESAQSCTDSSGSF 
AKLNGLFDSPVKEYQQNIDSPKLIVT/ SLTSRKELPP 
NGDTKSMVMDHRGQ P PELAAL PT PE STP VLHQKTIiQ A 
MKSHSEKAHGHGASRKETPQFFPSSPPPHSPLSHGHI 
PSAIVLPNATHDYNTSFSNSNAHKAEKKLQNIDHPLT 
KSSSKRDHRRSVDSRNTLNDIiLKHLNDPNSNPKAIMG 
DIQMAHQNLMLDPMGSMSEVPPKVPNREASLYSPPST 
LPRNS PTKRVDVPTTPGVPMTSLERQRGYHK 


2468 


A 


483 


764 


MGSVFWHVLFCI SGVCLWCAHRMAAFLQQMAVLLPVD 
CER PAA VHWLALCGCC YGQLVWES RTRS C FWS LECLC 
FGGQHFGS VPS FFCS S VWL * 


2469 


A 


3 


357 


FGFNGCSKRIIKLQELSDLEERENEDSMVPLPKQSLK 
FFCALEWLPSCDCRSPGIGLVEEPMDKVEEGPLSFL 
MKRKTAQKLAIQKALSDAFQKLLIWLG/ QDCLDHP* 
STSVSVSK 


2470 


A 


3 




RIGQGVPWHS * VEGGPNVI S I VLEYLRDTPPVPVW 
CDGSGRASDILAFGHKYSEEGG*VKVFLWCTHKWKED 

PM 


2471 


A 


69 


512 


MALAFLGTVLS KATLGARLTTHCAHPARRARAFSSDV 
MTHSSILTRASLLTLWTMFTRRTKILTEGSGVSWWAA 
AF PRDWAGGS I LALASLMTWT I GALLTAVLAAPAP 
EARSTVAS PGDGVAQSPI FALAPAGAVGTPVIT I AG* 


2472 


A 


2195 


872 


VSQATDVEVGTDLVPS VTVKVTLQNRVI LQKAKLS VY 
VQPPLELTCDQFTFEFMNRNPDGI PRVIQCKFRLPLK 
LICLPGQPSKTASHKITIDTNKSPVSLLSLFPGFASQ 
SDDDQVNVMGFHFLG\GAR\ITVLASKTSSTDIRIPG 
VEQFE\DLWASLTNELILRLQEYFEKQGVKDFACSFS 
G\SITPFKEYF\ELIGSIHFELRINGEKIiEELLSERA 
VQFRAIQRRLLARFKDKTPAPLQHLGHLVRMGTYK\Q 
VI ALA\ DAVGGKTKGNLFQS FTRLKS ATHLVILL I AL 
WQKLS ADQVAILEAAFLPLQEDTQELGWEETVDAAI F 
H\L* KTCCRKSAKQQALNPPGRLTYPNDTS\QLKKHI 
TLLCDRLSKGGRLCLSTDAA/ APHQTMVMPGGCTTI P 
ESDLEERSVEQDSTELFTNHRHLTAETPRPEVSPLQG 

VSE 


2473 


A 


1 


473 


EVRWNSPPTDSLSPDGGSIELEFYLAPEPFSMPSLLG 
APPYSGLGGVGDPYAPLMVLMCRVCLEDKPIKPLPCC 
KKAVCEECLKVYLSAQIQCPTCQFVWCFKCHSPWHEG 
VNCKE YKKGDKLLRHWAS E I EHGQRNAQKC PKC KI HI 
QRTEGCDHM 


2474 


A 


131 


1098 


RVPAGGARRLGQDPPRLPPGVADAPAAMSTQRLRNED
YHDYSSTDVSPEESPSEGLNNLSSPGSYQRFGQSNST
TWFQTLIHLLKGNIGTGLLGLPLAVKNAGIVMGPISL
LIIGIVAVHCMGILVKCAHHFCRRLNKSFVDYGDTVM
YGLESSPCSWLRNHAHWGRRWDFFLIVTQLGFCCVY
FVFLADNFKQVIEAANGTTNNCHNNETVILTPTMDSR
LYMLSFLPFLVLLVFIRNLRALSIFSLLANITMLVSL
VMIYQFIVQIL*MDLQPM*QTKVFHREQVPLCLQHVE
SQMEQFWAECFAQRVLPINVLSLQKK


2475 


A 


131 


1098 


RVPAGGARRLGQDPPRLPPGVADAPAAMSTQRLRNED 
YHDYSSTDVSPEESPSEGLNNLSSPGSYQRFGQSNST 
TWFQTLIHLLKGNIGTGLLGLPLAVKNAGIVMGPISL
LIIGIVAVHCMGILVKCAHHFCRRLNKSFVDYGDTVM
YGLESSPCSWLRNHAHWGRRWDFFLIVTQLGFCCVY
FVFLADNFKQVIEAANGTTNNCHNNETVILTPTMDSR
LYMLSFLPFLVLLVFIRNLRALSIFSLLANITMLVSL
VMIYQFIVQIL*MDLQPM*QTKVFHREQVPLCLQHVE
SQMEQFWAECFAQRVLPINVLSLQKK 


2476 


A 


505 


1373 


WGDGTQNE SHS S SVSLTAFLSDTKDRGP PVQSQ I WRS 
GEKVPFVQTYSLRAFEKPPQVQTQALRDFEKHLNDLK 
KENFSLKLLIYFLEERMQQKYEASREDIYKRNTELKV 
EVESLKRELQDKKQHLDKTWADVENLNSQNEAELRRQ 
FEERQQEMEHVYELLENKMQLLQEESRLAKNEAARMA 
ALVEAEKECNLELSEKLKGVTKNWEDVPGDQVKPDQY 
TEALAQRDK*VPSVLFL\RLSFAHSQGIQQLSCSLSR 
T/RQ* ELHYF *DFMGPQPKTFFSGLNFQWYPL 


2477 


A 


1 


317 


QRPSEAKEIKLYAQIPPIEKMDASLSMLANCEKLSLS 
TNCIEKIANLNGL\EAVGDTLEELWISYNFIEKLKGI 
HIMKKLKILYMSNNLVKDWGTPVI KGDEEEDN 


2478 


A 


2 


607 


CKNTLIRQNIPRAQFPATSPRSIIQQPN/PFPRRFVL
PLNVSLNAPEGDNLSPLSYTSASAVKQADGTIWCSHE
NLHQEDLEKEGGIEFPQIYYDRFSGKKYHFFYGCGFR
HLVGDSLIKVDWNKTLKVWREDGFYPSEPVFVPAPG
TNEEDGGVILSWITPNQNESNFLLVLDAKNFEELGR
AEVPVQMPYGFHGTFIPI


2479 


A 


2 


607 


CKNTLIRQNIPRAQFPATSPRSIIQQPN/PFPRRFVL
PLNVSLNAPEGDNLSPLSYTSASAVKQADGTIWCSHE
NLHQEDLEKEGGIEFPQIYYDRFSGKKYHFFYGCGFR
HLVGDSLIKVDWNKTLKVWREDGFYPSEPVFVPAPG
TNEEDGGVILSWITPNQNESNFLLVLDAKNFEELGR
AEVPVQMPYGFHGTFIPI


2480 


A 


101 


580 


LSLTKNCALLGEETMMEQEMTRLHRRVSEVEAVLSQK 
EVELKASETQRSPLEQDLATYITECSSLKRSLEQARM 
EVSQEDDKALQLLHD I REQSRKLQE I KEQE YQ AQ VEE 
MRLMMNQLEEDLVSARRRSDLYESELRESRLAAEEFK 
RKATECQHKLLK 


2481 


A 


1 


2025 


MAWAGRGRGSRQGSELHLPWAIDVCLFSLVRSGFRFL 
RE VWWE I WKKVLLLLHVANGAQQAGP I PWNTGLQANH 
S VPVS KPHQKWPVQHFQELLRS ANS LTAPFKQVQYWR 
GTKMNQRVPVPQIHSWFRMFC04AHESHGIGKWGVAL 
EGHPPGPGKQESIANACWEAAVRSPGSRSHKAETKSS 
KSRDQILSVLRPASFVRDKSIPQPWLESDGINKRWSP 
TCLSGEPSLGRVNPLLHELQTQCFVRTPSYQRATEAA 
KPQERCTIQLNKMCCLQAGSFSRYASVIAIKHICHAH 
STPKALLTSFLVLTTTRSLNLHLHLRLSHPDKFRDGG 
VSSSQYSRYCSLTQPDFDSSNSSTFFLLLTISLLSSQ 
FCIRLISLPECPVSQWQEAAREHLGGGSDLSSMGETH 
PDLGGGPSEGPGGWPWEQVSAAFAQLVLVSTMSFQGT 
WRKRFSSTDTQILPFTCAYGLVLQVPMMHQTTEVNYG 
QFQDTAGHQVGVLELPYLGSAVSLFLVLPRDKDTPLS 
HIEPHLTASTIHLPJTTSLRRARMDVFLPSELTKEPFR 
WDQRL F AL VLRL PGTMS VE S EQLTG VPLDD S AI T PMC 
EVTGVGMECFSDAKDTIEDLSEMHGSQDLSEMRGNPT 
KPSPPLSGTTVENFGSRGTDSYEAFSEPSLGKEPVTH 
RTRVPLQWP 


2482 


A 


137 


879 


LPPRGPATFGSPGCPPANSPPSAPATPEPARAPERVM 
ANAGLQLLGFILAFLGWI GAI VSTALPQWRI YS YAGD 
NIVTAQAMYEGLWMSCVSQSTGQIQCKVFDSLLNLSS 
TLQATLALMWGI LLGVIAI FVATVGMKCMKCLEDDE 
VQKMRMAVT GGAI FLLAGLAI LVATAWYGNRI VQE FY 
DPMTPVNARYEFGQALFTGWAAASLCLLGGALLCCSC 
PRKTTS YPTPRP YPK\ PAPS \ SGKD YV 


2483 


A 


200 


1139 


RIISTITYQFSAALGQEVFYITFLPFTHWNIDPYLSR 
RLI 1 1 WVLVMYIGQVAKDVLKWPRPSSPPVVKLEKRL 
IAE YGMPSTHAMAATAI AFTLLI STMDRYQYPFVLGL 
VMAWFSTLVCLSRLYTGMHTVLDVLGGVLITALLIV 
LTYPAWTFIDCLDSASPLFPVCVIVVPFFLCYNYPVS 
DYYS PTRADTTTI LAAGAGVT I GFWINHFFQLVSKPA 
ESLPVIQNIPPLTTYMLVLGLTKFAVGIVLILLVRQL 
VQNLSLQVL YS WFKWTRNKEARRRLE I EVPYKFVTY 
TS VGI GTKWAQMPTDV 


2484 


A 


173 


307 


SHICLKKSAKSLTGTWMKLETIILSKLTQEQKTKHCM
FSLISGS 


2485 


A 


173 


307 


SHICLKKSAKSLTGTWMKLETIILSKLTQEQKTKHCM
FSLISGS 


2486 


B 


86 


225 


PRQEKKSSHVSTRRSPKLLREKPEAAAGEAAAEAGLP 
MFARSRARSR 


2487 


A 


14 


1256 


W PCGAAPGLTHAS ERMFTLTTM I QALAP VMGWDRKPL 
KMFSSEEMRGHLHHHHKCLTKILKVEGQVPDLPSCLP 
LTDNTRMLASILINMLYDDLRCDPERDHFRKICEEYI 
TGKFDPQDMDKNLNAIQTVSGILQGPFDLGNQLLGLK 
GVMEMMVALCGSERETDQLVAVEALIHASTKLSRATF 
I ITNGVSLLKQI YKTTKNEKI KIRTLVGLCKLGSAGG 
TDYGLRQFAEGSTEKLAKQCRKWLCNMS IDTRTRRWA 
VEGLAYLTLDADVKDDFVQDVPALQAMFELAKTSDKT 
ILYS VATTLVNCTNS YDVKEVI PELVQLAKFSKQHVP 
EEHPKDKKDFIDMRVKRLLKAGVI SALACMVKADSAI 
LTDQTKELLARVFLALCDNPKDRGTI VAQGGGKALI P 

LALEGTD 


2488 


B 


526 


3482 


MDS LKQETQGLQKEKE SRE KELMGFS KS VNE ARSKMD 
VAQSELDIYLSRHNTAVSQLTKAKEALIAASETLKER 
KAAIRDIEGKLPQTEQELKEKEKELQKLTQEETNFKS 
LDKMAVWAKKMTEIQTPENTPRLFDLVKVKDEKIRQA 
FYFALRDTLVADNLDQATRVAYQKDRRWRWTLQGQI 
I EQSGTMTGGGSKVMKGRMGS SLVI E I SEEEVNKMES 
QLQNDSKKAMQIQEQKVQLEERWKLRHSEREMRNTL 
EKFTAS IQRLIEQEE YLNVQVKELEANVLATAPDKKK 
QKLLEENVSAFKTEYDAVAEKAEESLPEIQKEHRNLL 
QELKVIQENEHALQKDALSIKLKLEQIDGHIAEHNSK 
IKYWHKEISKISLHPIEDNPIEEISVLSPEDLEAIKN 
PDS ITNQI ALLEARCHEMKPNLGAI AEYKKKEELYLQ 
RVAELDKITYERDS FRQAYEDLRKQRLNE FMGS VR P P 
KKS WKKI FNLS GGE KTLS S LAL VF ALHH YKPT PL Y FM 
DE I DAALD F KNVS I VAF Y I YE AVWFLSN I T AGNQQQ V 
QAVIDANLVPMI IHLLDKGDFGTQKEAAWAI SNLTI S 
GRKDQVAYLIQQNVI PPFCNLLTVKDAQ WQWLDGL 
SNILKMAEDEAETIGMLIEECGGLEKIEQLQMHENED 
IYKXAYEIIDQFFSSDDVFNNFWFLILYINIDLDKV 
IYIGSSLMKQTPSPGVTVYESWWLYVNACKLDSPS 
GG 


2489 


A 


1 


747 


MRLQRPRQ APAGGRRAPRGGRGS P YRPDPGRGARRLR 
RFQKGGEGAPRADPPWAPLGTMALLALLLVVALPRVW 
TDANLTARQRDPEDSQRTDEGDNRVWCHVCERENTFE 
CQN PRRCKWTE P YC VI AAVKI F PRFFMVAKQCS AGC A 
AMER PKPEE KRFLLEE PM P FF YLKCCKI RYCNL/ GGA 
/NLSTHQ\CSKNMLGAWVRAWGCGWPSSCCWPPLQP 
ASACLEPRDCHRLSLPEHGLAPDRCHLLH 


2490 


A 


2 


1177 


GFVEAGEEC YCVS \ GQECRDLC C FAHNCS LRPGAQCA 

HGDCCVRCLLKPAGALCRQAMGDCDLPE FCTGTS SHC i 

PPDVYLLDGS PCARGSGYCWDGAC PTLEQQCQQLWGP ! 

GSHPAPEACFQWNSAGDAHGNCGQDSEGHFLPCAGR 

DALCGKLQCQGGKPSLLAPHMVPVDSTVHLDGQEVTC 

RGALALPSAQLDLLGLGLVEPGTQCGPRMVCQSRRCR 

KNAFQELQRCLTACHSHGVCNSNHNCHCAPGWAPPFC 

DKPGFGGSMDSGPVQAENHDTFLLAMLLSVLLPLLPG 

AGLAWCCYRLPGAHLQRCSWGCRRDPACSGPKDGPHR 

DHPLGGVHPMELGPTATGQPWPLDPENSHEPSSHPEK 

PLPAVSPDPQADQVQMPRSCLW 


2491 


A 


1 


609 


AAARTFWYKLF PCRGSGGAAKAAEQKRQVGGRAE PGT 
AAPCGARC PGPTPGWQVPATKALLSQPMGCP P PGPCR 
GHT * ADPQLPLTHAP/ PEARLS PQQPP / PS PPGSATP 
GA* AGVAS PKPTLPAPGAPGTPQRLPGP/ RREKPAFL 
SQPESST*PHPTPVSAASSSPA/PESSCHDELGLLSL 
NIjPAPGPPKPT pgaaas fqgs g 


2492 


A 


1 


242 


MNRGGFAVKI LALLDALSTVCSQRVQKAKKQQHLQNK 
EHFKALLKQKEKLKQQEDL/RKKLF*IQGIRCPQATP 
HHGQCSL 


2493 


A 


909 


353


RSFVLDTASAICNYNAHYKNHPKYWCRGYFRDYCNI I 
AFSPNSTNHVALRDTGNQLIVTMSCLTKEDTGWYWCG 
IQRDFARDDMDFTELIVTDDKGTLANDFWSGKDLSGN 
KTRSCKAPKWRKADRSRTSILIICILITGLGIISVI 
S HLTKRRRSQRNRRVGNTLKP F SRVLTPKEMAPTEQM 


2494 


A 


516 


848 


MWSLWI WVDQHQARLI PS PQVLLLLLRETPSTAAAVA 
GWLVVASMALLQLHAVGGVALTS SHPFMWATGEELRK 
P PWQGS AGS ASGVEELTGKHS C PGPE E PATVQKAPA* 


2495 


A 


349 


1018 


TFTQPDPDDLISKPPRTPGGG*YQTQWPSPPDPRRTS 
PAGRPGPARRPPRRTPRPARGRHPGR* GG PGASRPGG 
TGAAPAADQTGS PAVSTPSEFGAPGQAEGPQS PI RAS \ 
ARSHLS CTAWLGKPS KPSAQRQPTVGPDGDRDGSSQA 
PNLSRGQAWRASLASPQNTSATGRVTCHGQSTWPLCR 
LKSNRRRKSGFA/GNKSEPVGLTRRSKHQPRNPQGQV 
GI 


2496 


A 


349 


1018 


TFTQ PD PDDLI S KP PRT PGGG * YQTQWPS PPD PRRTS 
PAGRPGPARRPPRRTPRPARGRHPGR* GGPGASRPGG 
TGAAPAADQTGS PAVSTPSEFGAPGQAEGPQS P I RAS 
ARS HLS CTAWLGKPS KPSAQRQ PTVGPDGDRDGS SQA 
PNLSRGQAWRASLAS PQNTSATGRVTCHGQSTWPLCR 
LKSNRRRKSGFA/ GNKSEPVGLTRRSKHQPRNPQGQV 
GI 


2497 


A 


349 


1018 


TFTQPDPDDLISKPPRTPGGG*YQTQWPSPPDPRRTS 
PAGRPGPARRPPRRTPRPARGRHPGR*GGPGASRPGG 
TGAAPAADQTGS PAVST PS E FGAPGQAEGPQS P I RAS 
ARSHLSCTAWLGKPSKPSAQRQPTVGPDGDRDGSSQA 
PNLSRGQAWRASLAS PQNTSATGRVTCHGQSTW PLCR 
LKSNRRRKSGFA/GNKS E PVGLTRRS KHQPRN PQGQV 
GI 


2498 


A 


2025 


422 


PPGTQGSPQRT/ GDHGGKPPLPAEKPAPGPGLPARAS 
RAEGRGASGWKPGGQ PAGGS WQGGDAGPRRPAS GDQR 
TAGAAKALAGPAGEAAGGDRGAAQGDPPAEAGGRGG* 
TQAGGGASRARGSGAQRPGGP * RQGQGDGGESAS PAF 
GPC PQS SWGPPCS I PGP * PAL PGAL * GA\ VGRDPAGP 
PDGGPDTE P / PGS PGQAERWPEGCRPQGS WHCEGAPQ 
GPGAGARARPRQGSRGPRGAPRRGI PWAKSGR\ TGGS 
QDRKKPGKEVAATGTS I / PEGSQLARGRARSRDGGPS 
HEAQASEPRPGPCSGPARWGGRSSCTAPGCVTPAGTA 
GHL * WRAGWTAGPPAGPWRSPGDEKGPRGGPCACVPR 
AAERRGGRCC PGAQAEARARAGAQTSCPGGPEAGQCQ 
AQPGPETAGWLRPPEATAGPWPSCRGSAGPEGWGHHW 
P*PPA*CPGERPPWRPGCPAPPGCGGSSAGGPQPAA* 
TGAWASRGVLAPAGHEGHASHCPPRPAAGLSQPHPSQ 
TLEVTIiAS PQGFMS EALTKCE 


2499 


A 


1415 


661 


SLRTPGFRGGGVLYWDAGAAGTGSNHALGANVELWIM 
LLQWREGKFSGFLTSCSLLLPRAAQILAAEAGLPSS 
RS FMGFAAPFTNKRKAYS ERR IMGYSMQEMYE WSNV 
QE YREFVPWGKKSLWS SRKGHLKAQLEVGFP PVMER 
YTSAVSMVKPHMVKAVCTDGKLFNHLETIWRFSPGIP 
AYPRTCTVDFS I S FEFRSLLHSQLATMFFDE WKQNV 
AAFERRAATKFGPETAI PRE LMFHE VHQT 


2500 


A 


673 


941 


CCLAAHSGPPAQGQRRGPG*LCCSAGSGGNL*S*AGG 
PG* GRSGQ PVC P PWPGPGAPGHRPALPGSGGS S AVGR 
SAVPGAVRSPSHAGW ! 


2501 


A 


328 


1212 


RQEQGHFHF FCGGMS S FKAGTSHLDVYMQ VTEGRED Y 
NPSMHLAKRQFLSLEEEAEDYNPSQHRAQGNWLQDYN 
ASMQRVHGQCVSLEEDVELCVPRWACREMQSHNYPSR 
LVAGLQQYNFS I SLAQGECTSHWRKRGIMTYSS IHCL 
GDVTLHSYLGPSKTEDCDISVTLPPRLERRITLPKHW 
IKKYFTIFLMGKAQINKIDRPLVRQIKEKREKNQRDA 
IKNDKGU1 1 1 Jviriliiyi i IKiSx Xlsjliji^Wft.vciJNij.ciCij. 
DKFLETSTPPRLNEEEVESLSRPIAGSEIEAIINSL 


2502 


B 


1 


1428 


MGSRVRLS KRRAKAGVQSGTNALL WKHRDMNE KELE 
AQEARKAQLENHEPEEEEEEEIRQPRKKLGAQPWHW 
VAPDGRLLGNS S RTRVRGDGTLDVTI TTLRDSGTFTC 
I ASNAAGEATAPVE PRGLC PDYACTRFSTTVPLMTPS 
STGVDIEAARKEEERIMLRDARQWLNSGHINDVRHAK 
SGGTALHVAAAKGYTE VLKI I SLRFGVPRTQVRTWVA 
LYEKHGEKGLI PKPKGVSADPELRI KWKAVIEQHMS 
LNQAAAHFMLAGSGSVARWLKVYEERGEAGLRALKIG
TKRNIAI S VDPEKAAS ALELS KDRRI EDLERQVRFLE 
TRLMYLKKLKALAHPTKKAAEIPRSTFYYHLKALSKP 
DKYADVKKRI S E I YHENRGR YGYRRVTLS LHREGKQI 
NHKAVQRLMGTLSHKAAI KVKRYRS YRGEMKKLRIRE 
VQI LAGGHTAKLNMEQVKSADAFTYI kqpi A 


2503 


A 


218 


415 


MRCRAPAWLRRLCGQLLS ERLMRPNGVQAWRGI LEG 
AGAGAAGGS DAE VTAADWKKCDLI AKI LA 


2504 


A 


3 


136 


S WATAGAANGPAPLGVRAP PAWRTS PAAEMGATGAAE 
PLQS VLWVKQQRCAVSLE PARALLRWWRS PGPGAGAP 
GADACSVPVSEI I AVEETDVHGKHQGSGKWQKMEKPY 
AFTVHCVKRARRHRWKWAQVTFWCPEEQLCHLWLQTL 
REMLEKLTSRPKHLLVFINPFGGKGQGKRIYERKVAP 
LFTLASITTDIIVTEHANQAKETLYEINIDKYDG*VR 
RPSASARPQPGGRARRRRWGRRGRRSRCNPCCG 


2505 


A 


335 


1105 


MKRERGALSRASRALRLAPFVYLLLIQTDPLEGVNIT 
SPVRLIHGTVGKSALLSVQYS STSSDRPWKWQLKRD 
KPVT WQS I GTEVI GTLRPDYRDRI RLFENGSLLLSD 
LQLADEGTYEVEISITDDTFTGEKTINLTVDVPISRP 
QVLGASTTVLELSEAFTLNCSHENGTKPSYTWLKDGK 
PLLNDSRMLLS PDQKVLT I TR VLMEDDDL YS C WENP 
INQGRTLPCKITEYRKSSLSSIWLQEAFSSLGPW* 


2506 


A 


335 


1105 


MKRERGALSRASRALRLAPFVYLLLIQTDPLEGVNIT 
SPVRLIHGTVGKSALLSVQYS STSSDRPWKWQLKRD 
KPVTWQS I GTEVI GTLRPDYRDRI RLFENGS LLLSD 
LQLADEGTYEVEISITDDTFTGEKTINLTVDVPISRP 
QVLGASTTVLELSEAFTLNCSHENGTKPSYTWLKDGK 
PLLNDSRMLLS PDQKVLTI TRVLMEDDDL YSC WENP 
INQGRTLPCKITEYRKSSLSSIWLQEAFSSLGPW* 


2507 


A 


1160


3149


VS KTTTTNAGNALF PMPGS SKTKKPNSHQRGQMGS * G 
RNPPSLGRAPAPLPEREAPI PAPQLGPSAAGTSRQVG 
QKSSTS PHQGEEAILNRELKKKDGKKK* KK/ PTGLSK 
HQPAGF I QNE * NLKGAGEFVQGLAGSQNPPS S KLQGL 
GG\SAESRGFSRGQGQTAPHWESTPLKGALPPCPERG 
MLPEEG*GFSGKEASSGPVQPQPTCLYGIRPSLGS * P 
*GQRRTLLAPTFLQENQL\SGPSPGQRARSVLRPFSA 
/ PGLRPELELTGGRGSTRSRRAAGPWASDCTAGSDQE 
SLGRSSGKGR*GASGTVLGVSMCKV/ PGCKAAGGHLP 
GGGRGLDLECGWGLRSWLPGRGRQ/TGPPG/ PQGRDS 
*STKQSDSHRWQDSGGGLAPPPPGQGNNGARPCC*DV 
TKAS APGVSGDTGRE APS ATGI STFRSC CMS S ARGLG 
QSPAAPVLASSFLPTSCTGPPGLPGLPSSGSEENIHS 
GAWALVGQEGPSMDGRGNGMMLRGVWTGVHGGGMD\G 
CGAE VI * RGKFLME * YRSGLQRKQDS S PARTPAPQWL 
S ITTGS *TPE /GDPGGKLDAAQRGRAI AAH/GTAGGC 
CPRCCCHL* SPGSARSS P/ PMASAS IRVS \ PPRSGGS 
PPS PS S A* KS DRTDAGAGVAAAAS PGAGAPAHCPQGP 
PRSCQGPQRR 


2508 


A 


1 


957 


METSSPRPPRPSSNPGLSLDARLGVDTHLWAKVLFTA 
LYALI WALGAAGNALS VHWLKARAGRAGRLRHHVLS 
LALAGLLLLLVGVPVELYS FVWFHYPWVFRDLGCRGY
YFVHELCAYATVLSVAGLSAERCLiAVCQPLRARSLLT 
PRRTRWLVALS WAAS LGLALPMAVI MGQKHELETADG 
E PE PASRVCTVLVSRTALQVFIQE AI WMYVI CWLPY 
HARRLMYCYVPDDAWTD PLYNFYHYF YMVTNTL PYVS 
S AVTPLLYNAVS S S FRKLFLE AVS S LCGEHHPMKRLP 
PKPQS PTLMDT ASGFGDPPETRT 


2509 


A 


144 


291 


DVEVKWI E YQNMVNYLIQWI RHHVTTMSERTF PNNPV 
ELKVSVTVEIT 


2510 


A 


144 


291 


DVEVKWI EYQNMVNYLIQWIRHHVTTMSERTF PNNPV 
ELKVSVTVEIT 


2511 


A 


3 


279 


RSL P PAHS VSL WS VKDGLRPWH PELRS VQ PTRGGRTQ 
THRRGAAPGI STPHTLGGRASAARRPWHTCGRQRRPP 
RRRRERRPLYS SVLRST 


2512 


A 


3 


1396 


RQENNTRGVPSLLKSFLQERLGIHLIRRKIVKPKHHV 
LMSRKESWKVKSEI PKVPKQPLVLHHPRMTTTKS PSK 
DMLEPEAELAEDLPTTKSTSVES / EDAH* EPGRPFPV 
LPDL/PCHCLPSAPTPLCIVKRPCPT*VTQLSASAQS 
AHQMRTPRAQSPSS*PR*VNCLPPS/LHKDDLELKEK 
DQKKPPTAPREVKGTRRKLPTAFLPSKYHGYEELLTA 
KPD PAFI E PKGI QKNA/ PS PATNAEAPTPVPLLQAQA 
GHSSETLCSQRETGPENPDSTPKED*SPTSG*HLHSL 
AGSPEHYRGSTRCCPAPVDRTAAGEP/ ASSTWRPRGC 
*RSSRHVTGSW*VALCAQCSGLPRSPWPAQR*VRASP 
S SATS S S S WMS S ARS PQP VTHKARAVHGGC VHHPACA 
PALPEGSVPWTAPQG* PAGHRPQSSAGPHLLATRWHP 
LVRI S PPWPRHDLVPGPAAI KSGCTGQ 


2513 


A 


3 


1396 


RQENNTRGVPSLLKSFLQERLGIHLIRRKIVKPKHHV 
LMSRKESWKVKSEI PKVPKQPLVLHHPRMTTTKS PSK 
DMLEPEAELAEDLPTTKSTSVES /EDAH* EPGRPFPV 
LPDL/ PCHCLPSAPTPLCI VKRPCPT* VTQLSASAQS
AHQMRTPRAQSPSS*PR*VNCLPPS/LHKDDLELKEK 
DQKKPPTAPREVKGTRRKLPTAFLPSKYHGYEELLTA 
KPDPAF IE PKGI QKNA/ PS PATNAEAPTPVPLLQAQA 
GHSSETLCSQRETGPENPDSTPKED*SPTSG*HLHSL 
AGSPEHYRGSTRCCPAPVDRTAAGEP/ ASSTWRPRGC 
*RSSRHVTGSW*VALCAQCSGLPRSPWPAQR*VRASP 
S S ATS S SSWMS S ARS PQPVTHKARAVHGGCVHHPACA 
PALPEGSVPWTAPQG* PAGHRPQS SAGPHLLATRWHP 
LVRI S P PWPRHDLVPGPAAI KSGCTGQ 


2514 


A 


1065 


478 


HGLCELTSTVQEGELCVFFRNNHFSTMTKYKGQLYLL 
VTDQGFLTEEKVVWESLHNVUvjLKjjWr L-Ut>c.r riJ_iKJrJro 
DPETVYKGQQDQIDQDYLMALSLQQEQQSQEINWEQI 
PEGI SDLELAKKLQEEEDRRASQ YYQEQEQAAAAAAA 
ASTQAQQGQPAQASPSSGRQSGNSERKRKEPREKDKE 
KEKEKNSCVIL 


2515 


A 


1065 


478 


HGLCELTSTVQEGELCVFFRNNHFSTMTKYKGQLYLL 
VTDQGFLTEEKWWESLHNVDGDGNFCDSEFHLRPPS 
DPETVYKGQQDQIDQDYLMALSLQQEQQSQEINWEQI 
PEGI SDLELAKKLQEEEDRRASQYYQEQEQAAAAAAA 
ASTQAQQGQPAQASPSSGRQSGNSERKRKEPREKDKE
KEKEKNSCVIL 


2516 


A 


290 


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRANL 
GPCRRKRLQTLMRLAAGFQYSSHKDPSIiSAKEKHTDY 
HNEARGPWPGWVG* RTADGSCGRGPDGAHHPGPKS S S 
WRASRLLPGLGGSHHLDAYVGRDLECGTPAPLQLEI P 
PQ PRGHPAP I PTGQAGPRDSG PGAS P * VETRPLTDGR 
R * PGVRPVGWT PAHPAGTLR PRGAVE PS VS ACGKWAP 
SPTSQGCCEGRCDAVPKHRAWRTPLCSQ 


2517 


A 


2 


1736 


QNENSVDKWGKPLVIDKLKEMAKVEGLWNLFLPAVSG 
LSHVDYALI AEETGKC F FAPDVFNCQAPDTGNMEVLH 
LYGSEEQKKQWLEPLLQGNITSCFCMTEPDVASSDAT 
NIECSIQRDEDSYVINGKKWWSSGAGNPKCKIAIVLG 
RTQNTSLSR*LNNSD*ETCVGMSQSSSYLGNLLKIHC 
LDSQIIM*DMRVNVIYLYFTSIF*QVFLENIIGSIAE 
HSSLWNFQY*KVLLNYQSCLD*IIRQIFSDLCNEVIR 
CLDQRQ*S*NV*IiYI*VPSYHC*AVRSFNQTTHLFSN 
HCFCSRSQPASDYVGVRLLHSSHSSHHCLHDYMKTSK 
RQLGFCLLSVLFFFLANFF*YNFSFD*\HKQHSMILV 
PMNTPGVKI IRPLSVFGYTDNFHGGHFEIHFNQVRVP 
ATNLILGEGRGFEI SQGRLGPGRIHHCMRTVGLAERA 
LQIMCERATQRIAFKKKLYAHEWAHWIAESRIAIEK 
I RLLTLKAAHSMDTLGS AGAKKE I AMI KVAAPRAVS K 
I VDWAI QVCGGAGVSQDYPLANMYAITRVLRLADGPD 
EVHLSAIATMELRDQAKRLTAKI 


2518 


A 


2 


1736 


QNENSVDKWGKPLVIDKLKEMAKVEGLWNLFLPAVSG 
LSHVDYALIAEETGKCFFAPDVFNCQAPDTGNMEVLH 
L YGSEEQKKQWLE PLLQGNI TSCFCMTE PD VAS SDAT 
NIECSIQRDEDSYVINGKKWWSSGAGNPKCKIAIVLG 
RTQNTS LS R * LNNSD * ETC VGMSQS S S YLGNLLKI HC 
LDSQI IM*DMRVNVI YLYFTS I F *QVFLENI IGS I AE 
HSSLWNFQY* KVLLNYQS CLD * IIRQI FSDLCNEVIR 
CLDQRQ* S *NV*LYI * VPSYHC* AVRSFNQTTHLFSN 
HCFCSRSQPASDYVGVRLLHSSHSSHHCLHDYMKTSK 
RQLGFCLLSVLFFFLANFF* YNFSFD* \HKQHSMILV 
PMNTPGVKI IRPLS VFGYTDNFHGGHFE IHFNQVRVP 
ATNLILGEGRGFEI SQGRLGPGRIHHCMRTVGLAERA 
LQIMCERATQRIAFKKKLYAHEWAHWIAESRIAIEK
I RLLTLKAAHSMDTLGS AGAKKE I AMI KVAAPRAVSK 
I VDWAI QVCGGAGVSQDY PLANMYAITRVLRLADGPD 
EVHLSAIATMELRDQAKRLTAKI 


2519 


A 


2 


550 


FGVINLICTGFLLMWCSSrNblAJjl \ox i IJjI J.rLU_ir 
SLMTCLI S YWVTLRKPS PVYS FGFERIiEVLAVFASTV 
liAQLGALFILKE SAERFLEQPE IHTGRLLVGTFVALC 
FNLFTMLSIRNKPFAYVSEAASTSWLQEHVADLSRSL 
CGI I PGLSS I FLPRMNPFVLI DLAGAF ALC I TYML 


2520 


A 


1 


1876 


RAPMMTKAVPEEPRKPGRLTQALNSPLTWEHVWICVP 
GGTPDCLTDTFRVKRPHLRRSASNGHVPGTPVYREKE 
DMYDEIIELKKSLHVQKSDVDLMRTKLRRLEEENSRK 
DRQI EQLLD PSRGTDF VRTLAE KRPD AS WVINGLKQR 
I LKLEQQCKE KDGT I S KLQTDMKTTNLEEMR I AMET Y
YEEVHRLQTLLAS S ETTGKKPLGEKKTGAKRQKKMGS 
ALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQ 
GYVEWSKPRLLRRIVELEKKLSVMESSKSHAABPVKb 
HPPACLAS S S ALHRQPRGDRNKDHERLRGAVRDLKEE 
RTALQEQLLQRDLEVKQLLQAKADLEKELECAREGEE 
ERREREEVLREEIQTLTSKLQELQEMKKEEKEDCPEV 
PHKAQELPAPTPS SRHCEQDWPPDS SEEGLPRPRS PC 
S DGRRD AAARVLQ AQWKVYKHKKKKAVLDE AAVVLQA 
AFRGHLTRTKLLASKAHGSEPPSVPGLPDQSSPVPRV 
PSPIAQATGSPVQEEAIVIIQSALRAHLARARHSATG 
KRTTTAASTRRRS AS ATHGDAS S PPFLAALPDPS PSG 
PQAVAPLPGDDVNSDDSDDIVIAPSLPTKNFPV 


2521 


A 


5618 


4060 


APARRGLGDRCS S S S FSS S FFS SAS S PRRLATAAARA 
GGAAVI PVPEEPALPVPGGRGAGEAGPRRTQQVEPGV 
PGRAP PAHHAALCHLSRPQAKI LSMMEDNKQLALRI D 
GAVQSASQEVTNLRAELTATNRRIiAELSGGGGPGPGP 
GAAASASAAGDSAATNMENPQLGAQVLLREEVSRLQE 
E VHLLRQMKEMLAKDLEE S QG Vjtvb b c* V u oA l xi XjK v y jj 
AQKEQE LARAKEALQAMKADRKRLKGEKTDLVS QMQQ 
LYATLESREEQLRDFIRNYEQHRKESEDAVKALAKEK 

DLLEREKWELRRQAKhA 1 Drift. 1 AiiKoy jjijjj J-nj-hn xunxvo 
LEAELAMAKQSLATLTKDVPKRHSLAMPGETVLNGNQ 
EWWQADLPLTAAIRQSQQTLYHSHPPHPADRQAVRV 
SPCHSRQPSVISDASAAEGDRSSTPSDINSPRHRTHS 

PREHSGEC I S CSVLS FC KKRWMWGEKGMRPVCS LCPG 
G 


2522 


A 


1023 


766 


MLCSRLGTTASWRRLGIRAWAPLLLLFPWDWHFILSF 
S SRPWAGTLLAPHDVTMGS S TF PQS CQAEAGPRHAWP 
TGRF bKKliKK V w 


2523 


A 


1 


429 


NTLLTIIVLFPDPPSLSSNSSIRSSSSFSTCISCELS 
TSGCPAITTESVSASPSMISPSATSV*VTS*SSSCTS 
AS PGS PGSCWLLLES * EAPWASCSDLFLLEALLLPKR 
LLGWFTIRE S VS KGFRAALTVLAMLGLDRS KL 


2524 


A 


165 


638 


MFVIAFLSPLSLIFLAKr lKiJbK^i\^ij^u\c>ix«. 
LALNGVFTNT I KLI VGRPRPDFFYRCFPDGLAHSDLM 
CTGDKDWNEGRKS F PSGHS S FAFAGLAFAS FYIiAGK 
LHCFTPQGRGKSWKr t-/\r J-io f Li-ur t\i\v i/ujoki^u x rv. 
HHWQGPFKW* 


2525 


A 


165 


638 


MFVIAFLSPLSLIFLAKFLKKADTRDSRQACLAaSLA 
LALNGVFTNTI KL I VGRPRPDFF YRC F PDGLAHSDLM 
CTGDKDWNEGRKS FPSGHS S FAFAGLAFAS F YLAGK 
LHCFTPQGRGKSWRFCAFLSPLLFAAVIALSRTCDYK 

HHWQGPFKW* 


2526 


A 


2 


266 


KGSTEAF I SGTAGWGTGLL PS S AGLPGGWGPAGGWAG 
TDRRGPRARPI PQKSPPWPWSGDAAKGQSGFLPVAAW 
AGQGRLPGGGIIVH 


2527 


A 


2 


614 


PRVRLFTVITYFFWIGIAPIFILYELDSPLCWNEVF 
IGYGSALGSASFLTSFLGIWLFSYCMEDIHMAFIGIF 
TTMTGMAMTAFASTTLMMFLARVPFLFTIVPFSVLRS
MLSKVVRSTEQGTLFACIAFLETLGGVTAVSTFNGIY 
S ATVAWYPGFTFLLS AGLLLLPAI SLCWKCTS WNEG 
SYELLIQEESSEDASDRAC 


2528 


A 


2 


614 


PRVRLFTVITYFFWIGIAPIFILYELDSPLCWNEVF 
IGYGSALGSASFLTSFLGIWLFSYCMEDIHMAFIGIF 
TTMTGMAMTAFASTTLMMFLARVPFLFTIVPFSVLRS 
MLSKWRSTEQGTLFACIAFLETLGGVTAVSTFNGIY 
SATVAWYPGFTFLLSAGLLLLPAISLCWKCTSWNEG 
SYELLIQEESSEDASDRAC 


2529 


A 


1297 


793 


LGE PLGDLCELI PGDVQQLQMGEVHPGTGAQGSAAQS 

VAGEVQLTQLSHARQRPSCQGSQLIALDLQHMDISRQ 

PRWQHVQPVARQVQRAQQAQLAJKCaVAVUijWAv^ 

EVELLQEVGGGKVFAANACDLWQDHEGAHAARQATG 

HALQRVIVQVRRVQPLEAL*RVPSGLPRRVRAFMILH 

NQITGIGREDFATTYFLEELNLSYNRITSPQVHRDAF 

RKLRLLRSLDLSGNRLHMLPPGLPRNVHVLKVKRNEL 

AALARGALAGMAQLRELYLTSNRLRSRALGPRAWVDL 

AHLQLLDI AGNQLTE I PEGLPESLEYLYLQNNKI SAV 

PANAFDSTPNLKGI FLRFNKLAVGSWDSAFRRLKHL 

QVLDIEGNLEFGDISKDRGRLGKEKEEEEEDEVEEEE 

TR 


2530 


A 


2 


1671 


LADGDMLPLLLLPLLWGGSLQEKPVYELQVQKSVTVQ 
EGLCVLVPCSFS YPWRSWYSS PPLYVYWFRDGEI PYY 
AEWATNNPDRRVKPETQGRFRLLGDVQKKNCSLSIG 
DARMEDTGSYFFRVERGRDVKYSYQQNKLNLEVTALI 
EKPDIHFLEPLESGRPTRLSCSLPGSCEAGPPLTFSW 
TGNALS PLDPETTRS S ELTLTPRPEDHGTNLTCQMKR 

s*ki-i-rk. y>T niuiii'inmrmT \n7CVA T5ATTTT PDMHT IT .T? T T ./TNl' 1 ' 

QGAQVTTERTVQLNVb YAPy 1 1 1 J» r kwoxajjis j-jj^in x 
S YL PVLEGQ ALRLLCDAPSN P PAHLSWFQGS PALNAT 
PI SNTGILEIjRRVRS AEEGGFTCRAQHPLGSLQI FLN 
LSVYSLPQLLGPSCSWEAEGLHCRCSFRARPAPSLCW 
RLEEKPLEGNSSQGSFKVNSSSAGPWANSSLILHGGL 
SSDLKVSCKAWNIYGSQSGSVLLLQGRSNLGTGWPA 
ALGGAGVMALLCICLCLIFFLIVKARRKQAAGRPEKM 
DDEDPIMGTITSGSRKKPWPDSPGDQASPPGDAPPLE 
EQKELHYASLSFSEMKSREPKDQEAPSTTEYSEIKTS 
K 


2531 


A 


2 


1671 


LADGDMLPLLLLPLLWGGSLQE KPVYELQVQKS VTVQ 
EGLCVLVPCSFSYPWRSWYSSPPLYVYWFRDGEIPYY 
AEWATNNPDRRVKPETQGRFRLLGDVQKKNCSLSIG 
DARMEDTGSYFFRVERGRDVKYSYQQNKLNLEVTALI 
EKPDIHFLEPLESGRPTRLSCSLPGSCEAGPPLTFSW 
TGNALSPLDPETTRSSELTLTPRPEDHGTNLTCQMKR 
QGAQVTTERTVQLNVS YAPQTITI FRNGIALEILQNT 
S YL PVLEGQ ALRLLCDAPSN P PAHLSWFQGS PALNAT 
PI SNTGILELRRVRS AEEGGFTCRAQHPLGSLQI FLN 
LSVYSLPQLLGPSCSWEAEGLHCRCSFRARPAPSLCW 
RLEEKPLEGNSSQGSFKVNSSSAGPWANSSLILHGGL 
S SDLKVSCKAWNI YGSQSGS VLLLQGRSNLGTGWPA 
ALGGAGVMALLCICLCLIFFLIVKARRKQAAGRPEKM
DDEDPIMGTITSGSRKKPWPDSPGDQASPPGDAPPLE 
EQKELHYASLSFSEMKSREPKDQEAPSTTEYSEIKTS 

K 


2532 


A


51 


674 


QQAEEHLAAYSVSDSDSGKDPSMECCRRATPCa I JjJjJjr 
LAFLLLSSRTARSEEDRDGLWDAWGPWSECSRTCGGG 
ASYSLRRCLSSKSCEGRNIRYRTCSNVDCPPEAGDFR 
AQQCSAHNDVKHHGQFYEWLPVSNDPDNPCSLKCQAK 
GTTLWE LAPKVLiDvj X K.L. x 1 lio JuiJjylt^ l bLj-Liuy v oi\du 
FSFNLSRGFQCLCVNGLHSLTL 


2533 


A 


239 


577 


GQ PAR VW S LDTMGTRLL P AJj b u VIjJj v Lbr a V y 1 f 
QQDEMPSPTFLTQVKESLSSYWESAKTAAQNLYEKTY 
LPAVDEKLRDLYS KSTAAMST YTGI FTDQVLS VLKGE 
E 


2534 


A 


239 


577 


GQPARVWSLDTMGTRLLPALFLVLLVLGFEVQGTQQP 
QQDEMPSPTFLTQVKESLSSYWESAKTAAQNLYEKTY 
LPAVDEKLRDLYS KSTAAMST YTGI FTDQVLS VLKGE 
E 


2535 


A 


103 


318 


MWRKHLSLLVLRDFLLAPRRRDSLTLTHMATLAQKPC 
GIEKQICFYVLFSLSIFQHRLNSLKPRHLLRPDP* 


2536 


A 


1 


2374 


MVS I SDLVI C P PRHPKVLGLQGP PGLDS I S DP SAGAG 
FLDWGEIGMPGPGRAGHQALCKCDCQCLEKTTTKAPG 
KMPKSTRSGPVRVRLADGPNRCAGRLECGMPDAGEQC 
VMTTGTSGRHCGLLGTGLWKGYTDLTI I PPGPGTPPQ 
ERTCQGDYHSGGTWTHS PLETTRRPGS S S PAI RRLPA 
QMLLLPARPPHPRSSSPEAMDPPPPKAPPFPKAEGPS 
STPSSAAGPRPPRLGRHLLIDAN/ GVYPYTYTVQLEE 
E PRGP PQRE AP PGE PGPRKGYS CPECARVFAS PLRLQ 
SHRVSHSDLKPFTCGACGKAFKRSSHLSRHRATHRAR 
AGPPHTCPLCPRRFQDAAELAQHSWGTPRGPLLAAAC 
NCE VARGRLES PGPERLLHGYGGREEEGGWGRAAGGL 
DRVEGF I S S KAHHYLLI DTQGVPYTVLVTRSHRGSQG 
PVGLQARKVLQLPRVLKGLRVHVHLQRHb 1 1 Hb£. V f y 
DFAGSLDS FQTPGE S LRL VFRALDTTQS SRI S KAE PC 
LKEEPLSLGDLPYMHTTLCFCRKRRASPGPGTLQRGA 
LAWPDWAS PRALPVPSLS STTRS PAAPLFAVPLSGRT 
TQAMAFDGI I FQGQSQRSAGLTTTSRFLACQRPLRLC 
AWWASRS PRCTLRRPVGLRPGVHPRPRLVYRDLKPEN 
VMASGQPRDRPQPWFAWPPRPTRFCGGCWTLTPKEER 
CDRHQGAPGAPWRQREGEAEAVGAVEERLGSiih.AF^rU 
AEREAAHPRP PRPTAFGVS SGLPELLVKRWAQLQEL 
WTSSTAGGWSTOMOT 


2537 


A 


241 


957 


MRSSLTMVGTLWAFLSLVTAVTSSTSYFLPYWLFGSQ 
MGKPVS FSTFRRCNYPVRGEGHSLIMVEECGRYAS FN 
AI PSLAWQMCT WTGAGCALLLLVALAAVLGCCMEEL 
I SRMMGRCMGAAQFVGGLLI S SGCALYPLGWNS PEIM 
QTCGNVSNQ FQLGTCRLGWAY YCAGGGAAAAMLI CTW 
LSCFAGRNPKPVILGGKHHEENHFLCYGAWPLPSTLE 
LRKEDRGGRATGKQVTP 


2538 


A 


2817 


1352 


MAAAAAGAGSGPWAAQEKQFPPALLSFFIYNPRFGPR 
EGQEENKILFYHPNEVEKNEKIRNVGLCEAIVQFTRT
FS PS KPAKS LHTQKNRQFFNE PEENFWMVMVVRNP I 1 
EKQSKDGKPVIEYQEEELLDKVYSSVLRQCYSMYKLF 
ngtflkamedggvkllkerle kf fhrylqtlihijqb ljj 
LLDIFGGISFFPLDKMTYLKIQSFIN\RMEESIjNIVK 
YTAFLYNDQLIWSGLEQDDMR11jxK.iJjJ. IbljJk* PKril \ 

EPELAGRDS pi raempgnlqhygrfltgplnlndpda 
kcrfpkifvntddtyeelhli \vykamsaavcfmida 
svhptlgf\crrtgtaslgpqlhsgwasghlveqf*h 
qqggcsgv* gkepqfkfi yfnhmnlaekstvhmrktp 
svsltsvhpdlmkilgdinsdftrvdedeeiivkams 

D YWVVGKKS DRRc*Jj i V 1 JjIM y xSJM AIM LiJL Cj V JN CiCj V JVJVJjV^ri 

TQFNNIFFLD 


2539 


A 


171 


347 


NYSLSVYLVRQLTAGTLIjQKLRAKGIRNPDHSRALSE 
*HLSSLPHLIWIQVFLALQPS 


2540 


A 


2 


583 


FPGRRFRHNARRGFFFSHIGWLFVRKHRDVIEKGRKL 
DVTDLLADPVVRIQRKYYKISVVLMCFVVPTLVPWYI 
WGESLWNSYFLASILRYTISLNISWLVNSAAHMYGNR 
PYDKHISPRQNPLVALGAIGEGFHNYHHTFPFDYSAS 
EFGLNFNPTTWF I DFMCWLGLATDRKRATKPMI EARK 
ARTGDSSA 


2541 


A 


1 


1791 


MTSGPQTSQPKEHIiTNFKSDEQERVSSLAQSHTDNHR 
LHEPGLQEGIRAVPREDPQWNYQADS PRGPLDHHRRR 
ASGNSQWRQAKLI ALTRALTLAKGLRINI YTDS KYAF 
RI LHHHAVI WAERGFLPTQGS S I INATLI KTLLKAAL 
LPKEAGVIHCKGHQKASDPITQGNAYADKPIGFGLEK 
LLTFHLSQLQEYRGTKWREKSHRKVNHDENTSKLTSL 
NEE YTKNKTE YEEAQDAI VKE I VN I S SGYVE PMQTLN 
DVIiAQLDAWS FAHV SNCsAP VPiv Kir Al Jjii jxajU uki j. 
LKASRHACVEVQDEI AFI PNDVYFEKDKQMFHI ITGP 
NMGGKSTYIRQTGVIVLMAQIGCFVPCESAEVSIVDC 
I LARVGAGDSQLKGV STFMAEMLETAS I LRSATKDS L 
1 1 IDELGRGTSTYDGFGLAWAISEYI ATKIGAFCMFA 
THFHELTALiANQ I PTVNNLHVTALTTEETLTMLYQVK 
KGVCDQSFGIHVAKLiAWr Pl^VlaLAJNWJN^UJiiiJEiiirvj 
YIGESQGYDIMEPAAKKCYLEREQGEKIIQEFLSKVK 
QMPFTEMSEENITIKLKQLKAEVIAKNNSFVNEIISR 
IKVTT 


2542 


A 


1 


639 


AGTARFVCQAEGl PS PKMS WLiKNvaKKi HbWbKi rwiivt 
S KLVTNQ 1 1 PEDDAI YQCMAENSQGS I LSRARLTWM 
SEDRPSAPYNVHAETMSSSAILLAWERPLYNSDKVIA 
YSVHYMKAEGLNNEEYQWIGNDTTHYI IDDLEPASN 
YTFYIVAYMPMGASQMSDHVTQNTLEDGHTSVGLLQF 
AGGLLLTLVASVFPVPGDTTSEGCVTAK 


2543 


A 


700 


283 


VPRLVS PLSNPAPKFYCVS FFYHMYGKHIGSLNLLVR 
SRNKGALDTHAWSLSGNKGNVWQQAHVPI SPSGPFQI 
IFEGVRGPGYLGDIAIDDVTLKKGECPRKQTDPNKW 
VMPGSGAPCQS S PQLWGPMAI FLLALQR 


2544 


A 


2 


673 


NSRVEGQLCDLDPSAHFYGHCGEQLECRLDTGGDLSR 
GEVPEPLCACRSQSPLCGSDGHTYSQICRLQEAARAR 
PDANLTVAHPGPCESGPQIVSHPYDTWNVTGQDVIFG
CE VFAYPMAS I E WRKDGLD I QLPGDDPH I S VQFRGGP 
QRFEVTGWLQIQAVRPSDEGTYRCLiARNALGQVEAPA 
SLTVLTPDQLNSTGI PQLRSLNLVPEEEAESEENDDY 
Y 


2545 


A 


195 


635 


IATMETKDQKKQRKKNSGPKAAKKKKRHLQDLQLGDE 
EDAWKRNPICAFAFQSAVWMARSFHRTQDLKTKKHHI P 
WDRTPLEPPPIWWMGP/PKVGKSTLIQCIjIRNFT 
RQKLTE I RGPVMI VSGKKLRLTI I DCGCDINMM I DLA 


2546 


A 


167 


691 


MGWVWTLCTAS AGLTLLFWSQTPGKAFQI PCPP PHLS 
HWCLSPMQMDDGCARLCVLWTAWMRWRVLMCSCRVWA 
TDLGI FLGVALGNE PLEMWPLTQNE ECTVTGFLRDKLj 
QYRSRLQYMKHYFPINYKIRVPYEGVFRIANVTRLRA 
QGSERELRYLGVLVSLSATESVHDELL 


2547 


A 


1 


337 


RRF VS QETGNL Y I AKVE KSDVGNYTC VVTNTVTNHKV 
LGPPTPLI LRNDGVMGE YE PKI EVQFPETVPTAKGAT 
VKLECFALGNPVPTI IWRRADGKPI ARKARRHKSRVG 
K 


2548 


A 


2 


462 


EFQEAAKLYHTNYVRNSRAI GVLWAI FTICFAI VNW 
CFIQPYWIGDGVDTPQAGYFGLFHYCIGNGFSRELTC 
RGSFTDFSTLPSGAFKAASFFIGLSMMLIIACIICFT 
LFFFCNTATVYKI CAWMQLTS AACLVLGCMI FPDGWD 
SDEVN 


2549 


A 


418 


768 


AFTKHLLKPRMEVKDCGAHNLEKGLTI FFHKGPSSMY 
FRLCGPHEGRFFFIA I PPLHLLHLLFPLHFF YNFRDE 
ELSCTWELKYTGNASALLILPDQDKMEEVEAMLLPE 
TFALCC 


2550 


A 


2484 


121 


AIMTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTV 
TSCAVDQENQDPRRWVQKPPLNIQRPLVDSAGPRPKA 
RHQAETSQRLVGISQPRNPLEELRPSPRGQNVGPGPP 
AQTEAPGTI EFVADPAALATI LSGEGVKSCHLGRQPS 
LAKRVLVRGSQGGTTQRVQGVRASAYLAPRTPTHRLD 
PARASCFSRLEGPGPRGRTLCPQRLQALISPSGPSFH 
PSTRPSFQELRRETAGSSRTSVSQASGIiLLETPVQPA 
FSLPKGEREWTHSDEGGVASLGLAQRVPLRENREMS 
HTRDSHDSHLMPS PAPVAQPLPGHWPCPS PFGRAQR 
VPSPGPPTLTSYSVLRRLTVQPKTRFTPMPSTPRVQQ 
AQWLRGVS PQS C S ED PALPWEQVAVRLFDQESC IRSL 
EGSGKPPVATPSGPHSNRTPSLQEVKIQRIGILQQLL 
RQEVEGLVGGQCVPLNGGS SLDMVELQPLLTEI SRTL 
NATEHNSGTSHLPGLLKHSGLPKPCLPEECGEPQPCP 
P AE PGP PE AFCRSE PE I PE PS LQEQLE VPE P Y PPAE P 
RPLESCCRSEPEIPESSRQEQLEVPEPCPPAEPRPLE 
SYCRIEPEIPESSRQEQLEVPEPCPPAEPGPLQPSTQ 
GQSGPPGPCPR\VELGASEPCTLEHRSLEPSLPP\CC 
SQWAPATTSLIFSSQ\HPLCASPPICSFQS\LRPPA\ 
GQAG/LSANLAPLEPLALKGAAFKSC\LTAIHCFHEA 
SSWTIECAF\YTSRAPP\SGPTRVCTNPVATLLEWQD 
ALCF I PVGSAAPQGS P 


2551 


A 


356 


1313 


NCNLSVGSSCLSIiASVWLARRMWTLRSPLTRSLYVNM 
TSGPGG PAAAAGGRKENHQ WYVCNREICLCE S LQAVFV
QSYLDQGTQIFLNNSIEKSGWLFIQLYHSFVSSVFSL 
FMSRTS INGLLGRGSMFVFS PDQFQRLLKINPDWKTH 
RLLDLGAGDGEVTKIMSPHFEEIYATELSETMIWQLQ 
KKKYRVLGI NEWQNTGFQYDVI SCLNLLDRCDQPLTL 
LKDIRSVLEPTRGRVILALVLPFHPYVENVGGKWEKP 
S E I LE I KGQNWE EQVNSLPE VFRKAGF VI E AFTRLPY 
LCEGDMYNDYYVLDDAVFVLKPV 


2552 


A 


299 


21 


MGSSVLSIWILSPSIYPILSPLAMPCLSRTDLIRVRR 
I QGAWPS EGTAS S IRGWVLTKLRMS SGKALE AL YC I P 
GAAQHPGLGVTRVWSGRT* 


2553 


A 


337 


642 


FAFPHYYI KPYHLKRIHRAVLRGNLEKLKYLLLTYYD 
ANKRDRKERTALHLACATGQPEMVHLLVSRRCELNLC 
DREDRT PLI KAVQLRQEACATLLLQNGA 


2554 


B 


111 


1520 


PS I PAAVPQ S APPE PHREETVTATATSQVAQQ P PAAA 
APGEQAVAGPAPSTVPSSTSKDRPVSQPSLVGSKEEP 
PPARSGSGGGSAKEPQEERSQQQDDIEELETKAVGMS 
NDGRFLKFDIEIGRGSFKTVYKGLDTETTVEVAWCEL 
QDRKLTKSERQRFKEEAEMLKGLQHPNIVRFYDSWES 
TVKGKKCI VLVTELMTSGTLKTYLKRFKVMKI KVLRS 
WCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGS 
VKI GDLGLATLKRAS FAKS VI GTPE FMAPEMYEE KYD 
E S VDVYAFGMCMLEMATSE YP YSECQNAAQI YRRVTS 
GVKPAS FDKVAI PEVKEI I EGC I RQNKDERYS I KDLL 
NHAFFQEETGVRVELAEEDDGEKIAIKLWLRIEDIKK 
LKGKYKDNEAIEFSFDLERNVPEDVAQEMVESGYVCE 
GDHKTMAKAI KDRVSLI KRKREQRQL* 


2555 


B 


111 


1520 


PSI PAAVPQSAPPEPHREETVTATATSQVAQQPPAAA 
APGEQAVAGPAPSTVPSSTSKDRPVSQPSLVGSKEEP 
PPARSGSGGGSAKEPQEERSQQQDDIEELETKAVGMS 
NDGRFLKFD I E IGRGS FKTVYKGLDTETTVEVAWCEL 
QDRKLTKSERQRFKEEAEMLKGLQHPNIVRFYDSWES 
TVKGKKCI VLVTELMTSGTLKTYLKRFKVMKI KVLRS 
WCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGS 
VKI GDLGLATLKRAS FAKSVIGTPEFMAPEMYEEKYD 
ESVDVYAFGMCMLEMATSEYP YSECQNAAQI YRRVTS 
GVKPAS FDKVAI PEVKEI I EGC I RQNKDERYS I KDLL 
NHAFFQEETGVRVELAEEDDGEKIAIKLWLRIEDIKK 
LKGKYKDNE AI E FS FDLERNVPEDVAQEMVE SG YVCE 
GDHKTMAKAI KDRVSLI KRKREQRQL* 


2556 


A 


105 


447 


LI FCRVFEYLHSLHLPQEICLSLALFSRFTFCVI ICE 
VDVWSVI FKVPFCSKRNKVAVHTMLYIQI FVSLFI * P 
QNWKQPKCPATVERINKMWYIHIV/EYYSANKR 


2557 


A 


1 


512 


DEELPDLSVSRRSSHLHWGI PVPGYDSQTI YVWLDAL 
VNYLTVIGYPNAEFKSWWPATSHIIGKDILKFHAIYW 
PAFLLGAGMS PPQRI CVHSHWTVCGQKMSKSLGNWD 
PRTCLNRYTVDGFRYFLLRQGVPNWDCDYYDEKWKL 
LNSELADALGGLLNRCTAKRIN 


2558 


A 


1117 


647 


MI LQVS GGPWTVALTALLMVLLI S WQSRATPENS VY 
QERQECYAFNGTQRWDGLI YNREE YVHFDS AVGEFL 
AVMELGRPIGEYFNSQKDFMERKRAEVDKVCRHKYEL
MEPIjIRQRRGDVTITAVRGCWTTILSGYFLLKRGWS
GGCSWGSS* 


2559 


A 


1027 


254 


STQRGGIKGVARAASLVGRRRAGTGMALLLCLVCLTA 
ALAHGCLHCHSNFSKKFSFYRHHVNPKSWWVGDIPVS 
GALLTDWSDDTMKELHLAI PAKITREKLDQVATAVYQ 
MMDQLYQGKMYFPGYFPNELRNIFREQVHLIQNAIIE 
SRIDCQHRCGIFQYETISCNNCTDSHVACFGYNCESS 
AQWKSAVQGLLNYINNWHKQDTSMRPRSSAFSWPGTH 
RAAPAFLVLPALRCLEPPHLANLSLEDAA*CLKQH 


2560 


A 


1027 


254 


STQRGGIKGVARAASLVGRRRAGTGMALLLCLVCLTA 
ALAHGCLHCHSNFSKKFSFYRHHVNFKSWWVGDI PVS 
GALLTDWSDDTMKELHLAI PAKITREKLDQVATAVYQ 
MMDQLYQGKMYFPGYFPNELRNIFREQVHLIQNAI IE 
SRIDCQHRCGIFQYETISCNNCTDSHVACFGYNCESS 
AQWKSAVQGLLNYINNWHKQDTSMRPRSSAFSWPGTH 
RAAPAFLVLPALRCLEPPHLANLSLEDAA*CLKQH 


2561 


A 


88 


459 


AGDHVSRNI PVATNNPVRAVQEETRDRFHLLGDPQNK 
DCTLS I RDTRE S DAGTYVF CVERGNMKWNYKYDQLSV 
NVTASQDLLSRYRLEVPESVTVQEGLCVSVPCSVLYP 

HYNWTAS S PVYGS 


2562 


A 


337 


1129 


AHLSARLSALILDEVAILPAPQNLSVLSTNMKHLLMW 
SPVIAPGETVYYSVEYQGEYESLYTSHIWIPSSWCSL 
TEGPECDVTDDITATVPYNLRVRATLGSQTS / CLEHP 
/VSIPLIETQPSLPDL/RMEITKDGFHLVIELEDLGP 
OFEFLVAYWRREPGAEEHVKMVRSGGIPVHLETMEPG 
AAYC VKAQT FVKAI GR YS AF SQTEC VE VQGE AI PLVL 
ALFAFVGFMLI LWVPLFVWKMGRLLQ/ YLLLPRGGS 
SQTPWKITQF 


2563 


A 


1 


359


I SGES I YWS QKPTPS SNAS PWSE PAAVDVELTAYALL
AQLTKPSLTQKEIAKATSIVAWLAKQRNAYGGFSSTQ 
DTWALQALAKYATTAYVPSEEINLWKSTENFQRTF 

NIQAVNRM 


2564 


A 


150 


299 


MTFL I LS I APVLAVTGMI ETAAMTGFANKDKQELKHA 
GKQLKLWRIYVL*


TABLE 8


SEQ ID NO:


Number of TM


TM range: scores


695 


1 


174-193:1980 


696 


1 


49-73:2788 


704 


1 


168-185:1769 


711 


1 


4488-4504:2911


722 


6 


272-290:2864 32o-33 1:1 /23 ooj- 




oo0:234o 1 1UZ-1 128.3 103 Viol' 






1153:1708 1161-1180:2038 


731 


1 


406-434:2243


732 


1


579-607:2245 


736 


1 


364-380:1936 


740 


1 


302-321:2224 


742 


1 


816-832:1758 


756


1


1012-1028:1967 


757 


1


529-548:3334 


758 


1 


533-552:3334 


759 


4 


1014-1033:2221 1095-1113:2566 




1171-1194:2506 1245-1265:2246 


761 


3 


65-83:2205 117-136:2143 853- 




870:2248 


773 


3 


73-88:2787 168-186:2328 340- 




360:2085 


776 


3 


90-106:2479 212-232:2302 Jo/- 




403:2183 


781 


1 


115-132:1854


784 


1 


53-69:2130 


795 


3 


433-453:1894 506-531:1812 606- 




622:2130 


798 


1 


176-192:2849 


804 


1 


231-248:3490


825 


1


80-99:2954 


826 


1 


194-213:2954 


835 


4 


94-110:2105 145-161:1993 203-




223:2483 j6o-3o3:1o33 


836 


5 


A/t i i A.i 1 ac 1 /l < 1 £1 -7787 7fl7- 
94-HU:21U3 143-101.2Z5Z ZU/- 




77£«1717 A77 447*1 RIO *i1Q- 
220. 1/12 42 lolU J 1^- 






C"t7.7/£B7 
jj / .ZOoZ 


838 


1 


330-347:3343


839 


1 


88-109:2169


842 


1 


149-175:1731


843 


1 


149-175:1731


846 


1 


300-316:1761


851 


1 


383-405:2659 


852 


1 


379-401:2659 


860 


1 


61-81:3175


866 


2 


62-81:1837 131-147:2134 


871 


1 


50-68:2276


877 


3 


155-173:2724 426-442:2801 780- 




800:2540 


883 


3 


192-214:1749 266-284:1879 425- 




444:2199 


889 


2 


183-205:2141 304-320:2692 


897 


1 


538-553:1709 


898 


1 


725-740:1709 


899 


1


58-73:1930 


901 


1 


102-121:2779


905 


1 


208-225:3345 


906 


1 


116-133:2747 


926 


3 


266-286:2107 431-450:2017 494- 




509:2005 


927 


1 


307-329:2730 


930 


2


204-221:1978 259-275:1735 


939 


1 


88-116:1861 


950 


3 


343-368:2429 440-456:2054 498- 




513:2344 


951 


1 


676-696:2381 


952 


1 


79-95:2605


955 


1 


178-196:2063 


958 


1 


394-414:2626 


964 


1 


735-758:3292 


968 


1 


84-99:2458 


969 


4 


59-75:2180 119-134:2458 415- 




433:2785 501-522:2904 


970 


1


267-284:3132 


975 


3 


192-208:2437 279-296:1885 392- 




409:2589 


976 


3 


266-282:2437 353-370:1885 466- 




483:2589 


992 


1 


1065-1083:1762 


993 


1 


124-141:2188 


996 


1 


450-474:2798 


1003 


1 


313-334:2372 


1018 


5 


71-95:2393 145-166:2340 187- 






204:1848 237-256:3231 297- 






318:1783 


1023 


1 


239-257:2651 


1024 


1 


377-395:1757 


1025 


1 


339-357:1757 


1032 


3 


192-214:1749 266-284:1879 425- 






444:2199 


1039 


2 


152-168:2052 244-259:1761 


1042 


3 


110-124:2032 198-214:1804 512- 






531:2204 


1050 


2 


460-476:2094 570-590:2709 


1055 


1 


306-332:2732 


1062 


2 


82-97:2605 165-182:2300 


1071 


5 


84-100:2101 214-230:2609 380- 






395:2074 456-478:1922 536-






553:1999 


1085 


2 


40-69:2283 99-120:1980 


1094 


4 


93-108:2432 170-187:2464 205- 






220:2179 241-265:2052 


1098 


2 


142-158:1937 197-216:2428 


1099 




550-567:3380 


1110 




105-127:2966 


1117 




225-240:1816 473-494:3219


1118 




234-255:3219 


1130 




1245-1266:3138 


1143 




80-99:2954 


1144 




194-213:2954 


1146 




233-249:2778


1169 


1 


39-oo:20y/ 


1180 




77-100:1932 


1194 


1 


105-121:2609 


1195 


1 


86-104:1835


1197 




202-221:2/01 


1213 


1


692-715:1701


1223


1 


347-363:2829 


1234 


1 


555-570:1891 


1237 


1 


518-537:2980


1240 




676-696:2930 


1245 




89-105:1701 156-172:2335 


1247 


* 


856-879:3766 


1249 


1 


211-237:3134 


1251 


2 


82-99:2126 203-219:2134 


1252 


2 


75-92:2355 196-212:2053 


1264 


3 


189-206:2466 247-266:1853 321- 




336:1839


1265 


1 


580-604:2903


1266 


1 


580-604:2903


1274 


1 


56-70:2193


1275 


1 


719-739:2381 


1279 


1 


ICC 1 TC.OC 1 1 

155-1 /5:Z5l 1 


1284 


3 


89-105:174o 155-1 /3.Z43J 33U- 




3oo:zi/o 


1289 


1


471-489:2039 


1290 


1 


195-212:1943


1292 


1 


241-2o3:Zo/o 


1293 


1 


241-2o3:2o/o 


1306 


1 


610-625:2249 


1310 


1 


AA1 OOl.iflAO ' 

201-221: lyOo 


1313 


1 


201-217:2490


1315 


1 


59-75:2149


1316 


1 


59-75:2149


1319


4


200-217:2717 258-273:1781 295- 




318:2028 416-436:2373 


1322 


1 


356-381:1996 


1330 


2 


86-104:2471 167-190:2177 


1337 


1 


194-209:1865 


1341 


2 


144-165:2452 216-235:1700 


1349 


2 


102-117:3056 174-195:2254 


1363 


1 


435-452:2888 


1364 


1 


235-254:3185 


1368 


1 


114-134:1898


TABLE 9


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


1 


685 


1369


1967


784 9546


2 


686 


1370


1968


784 9546


3 


687 


1371


1969


784 9546 


4 


688 


1372


1970


784 9546


5 


689 


1373


1971


787 7048 


6 


690 


1374


1972


784 2242


7 


691 


1375


1973


784 6005


8


692


1376


1974


788 2591


9 


693








10 


694 


1377


1975


789 2432


11 










12 










13 


697 


1378


1976


784 3765


14 


698 


1379


1977


784 6649


15 


699 








16 


700 


1380


1978


/o4 O/OO 


17 


701


1381


1979


784 4050


18 


702 


1382




TOT 1AOZT1 

7o7 10261 


19 


703 


1383




TOT /CA10 

/o/ 6018 


20 


704


1384


1982


784 6424 


21 


705 


1385




TOT 1A1A1 

7o7 10201 


22 


706 


1386


1984


nor T/CQQ 

/OJ ZOoo 


23 


707 


1387 


1985 


784_420 


24 


708 


1388 


1986 


noA cian 
/o4 M3U 


25 


709 


1389 


1987 


TQA 1 1 An 

/oy nuy 


26 


710 


1390 


1988 


TQ/1 C1/I1 

/o4 M41 


27 


711 


1391 


1989 


no A TO 1/1 

/o4 ZZ14 


28 


712 


1392 


1990 


704 111/1 
/ OH-__ZZ 14 


29 


713 


1393 


1991 


904 STX 
/o4 DlZj 


30 


714 


1394 


1992 


784 907/S 

/OH ZU/O 


31 


715 


1395 


1993 


784 909£ 
/54 ZU/O 


32 


716 


1396 


1994 


904 4190 
/o4 41Zo 


33 


717 


1397 


1995 


759 940Q 
/ O / Z4Uy 


34 


718 


1398 


1996 


784 1919 
/54 jZjZ 


35 


719 


1399 


1997 


TQ/l 1 A11 O 

/o4 lUZio 


36 


720 


1400 


1998 


TOT 1A/T1 

/o/ zyoi 


37 


721 


1401 


1999 


/04 1ZD4 


38 


722 


1402 


2000 


HQ A CQ1 

/o4 OoJ 


39 


723 


1403 


2001 


noA QAC^C 


40 


724 


1404 


2002 


TO/I 1TQ/! 

/o4 Jzo4 


41 


725 


1405 


2003 


784 5767 


42 


726 


1406 


2004 


784 1548 


43 


727 


1407 


2005 


784 3819 


44 


728 


1408 


2006 


784 582 


45 


729 


1409 


2007 


784_390 


46 


730 


1410 


2008 


784 4142 


47 


731 


1411 


2009 


785 3653 


48 


732 


1412 


2010 


785 3653 


49 


733 


1413 


2011 


785 3653 


50 


734 


1414 


2012 


785 3653 


51 


735 


1415 


2013 


784 ^1 


52 


736 








53 


737 


1416 


2014 


HQ A 1f\QO 

/oh juyz 


54 


738 


1417 


2015 


784 *3R9 


55 


739 








56 


740 


1418 


2016 


707 1 MO 
/o/ IJ^O 


57 


741 


1419 


2017 


78^ 996 


58 


742 


1420 


2018 


19.6. 91^9 


59 


743 


1421 


2019 


704 4779 
/ 0*+ H 1 1 A 


60 


744 


1422 


2020 




61 


745 


1423 


2021 


787 QAQ1 


62 


746 


1424 


2022 


787 QfiQ1 


63 


747 


1425 


2023 


7Q9 14fi 


64 


748 


1426 


2024 


784 8498 


65 


749 


1427 


2025 


78Q 1799 » 


66 


750 








67 


751 


1428 


2026 


784 767 


68 


752 


1429 


2027 


784 4697 


69 


753 


1430 


2028 


785 197 


70 


754 


1431 


2029 


784 1601 


71 


755 


1432 


2030 


792 7466 


72 


756 


1433 


2031 


787 ^014 


73 


757 


1434 


2032 


784_1605 


74 


758 


1435 


2033 


784 1605 


75 


759 


1436 


2034 


784 6460 


76 


760 


1437 


2035 


784 1606 


77 


761 


1438 


2036 


784 1723 


78 


762 


1439 


2037 


785 1480 


79 


763 


1440 


2038 


784 9631 


80 


764 


1441 


2039 


784 5962 


81 


765 


1442 


2040 


784 5962 


82 


766 


1443 


2041 


784 5962 


83 


767 


1444 


2042 


784 7108 


84 


768 


1445 


2043 


784 2392 


85 


769 


1446 


2044 


784 4227 


86 


770 


1447 


2045 


784 7743 


87 


771 


1448 


2046 


784 561 


88 


772 


1449 


2047 


790 491 


89 


773 


1450 


2048 


789 fHOQ 


90 


774 


1451 


2049 


787 954^ 


91 


775 


1452 


2050 


784 3892 


92 


776 


1453 


2051 


787_3685 


93 


777 


1454 


2052 


784 8321 


94 


778 


1455 


2053 


784 7951 


95 


779 


1456 


2054 


784 4225 


96 


780 


1457 


2055 


784 7169 


97 


781 


1458 


2056 


784 5044 


98 


782 


1459 


2057 


784_5670 


99 


783 


1460 


2058 


784 2357 


100 


784 


1461 


2059 


784_6637 


101 


785 


1462 


2060 


784 3755 


102 


786 


1463 


2061 


784 9196 


103 


787 








104 


788 


1464 


2062 


784 706 


105 


789 


1465 


2063 


784 706 


106 


790 








107 


791 








108 


792 


1466 


2064 


784 4289 


109 


793 


1467 


2065 


784 7228 


110 


794 


1468 


2066 


784 3033 


111 


795 


1469 


2067 


784 6065 


112 


796 


1470 


2068 


785_2882 


113 


797 


1471 


2069 


785 2882 


114 


798 


1472 


2070 


785 2882 


115 


799 


1473 


2071 


784 7266 


116 


800 


1474 


2072 


784 7453 


117 


801 


1475 


2073 


784_7453 


118 


802 


1476 


2074 


788 13662 


119 


803 








120 


804 


1477 


2075 


784 2527 


121 


805 


1478 


2076 


784 2968 


122 


806 


1479 


2077 


785 3195 


123 


807 


1480 


2078 


785_3195 


124 


808 


1481 


2079 


785 3195 


125 


809 


1482 


2080 


790 14016 


126 


810 


1483 


2081 


790 21053 


127 


811 


1484 


2082 


787 9817 


128 


812 


1485 


2083 


784 4047 


129 


813 


1486 


2084 


784 4047 


130 


814 


1487 


2085 


784 4047 


131 


815 


1488 


2086 


787 9324 


132 


816 


1489 


2087 


785 3086 


133 


817 


1490 


2088 


785 3086 


134 


818 


1491 


2089 


784_7345 


135 


819 


1492 


2090 


784 8313 


136 


820 


1493 


2091 


787_71 


137 


821 


1494 


2092 


784 5644 


138 


822 


1495 


2093 


790 16836 


139 


823 


1496 


2094 


784_7226 


140 


824 


1497 


2095 


784 1134 1 


141 


825 


1498 


2096 


784_7001 


142 


826 


1499 


2097 


784 7001 


143 


827 


1500 


2098 


788_3086 


144 


828 


1501 


2099 


787 1984 


145 


829 


1502 


2100 


784 3145 


146 


830 


1503 


2101 


784 3145 


147 


831 


1504 


2102 


784 1806 


148 


832 


1505 


2103 


784 1806 


149 


833 


1506 


2104 


788 594 


150 


834 


1507 


2105 


784 3693 


151 


835 


1508 


2106 


7Q^ 511 


152 


836 


1509 


2107 


7Q^ <:ii 

/SO JJl 


153 


837 


1510 


2108 


/oh /4Uo 


154 


838 


1511 


2109 


7C7 5051 

/o/ djo i 


155 


839 


1512 


2110 


700 617 
/yu ooz 


156 


840 


1513 


2111 


lyL DhiO 


157 


841 


1514 


2112 


7J3C T217 
/ oO I J I / 


158 


842 


1515 


2113 


HQ A QtVlA 
/OH OOJH 


159 


843 


1516 


2114 


HQA QtZIA 
/oh O0->H 


160 


844 


1517 


2115 


HQ A AQ 10 
/OH nolo 


161 


845 


1518 


2116 


7RA /lfilR 
/OH HOlO 


162 


846 


1519 


2117 


70c 7QO 


163 


847 


1520 


2118 


/OH lOOH 


164 


848 


1521 


2119 


784 1 814 
/oh 1 0 JH 


165 


849 


1522 


2120 


7524 one 

/ OH Z?J 


166 


850 


1523 


2121 


787 9011 


167 


851 


1524 


2122 


784 9671 

/ OH- ZO / J 


168 


852 


1525 


2123 


784 9671 

/ OH / J 


169 


853 


1526 


2124 


784 9671 

/ OH ZO / J 


170 


854 


1527 


2125 


784 1944 

/ OH jZhh 


171 


855 


1528 


2126 


784 0676 
/oh yo/O 


172 


856 


1529 


2127 


784 7451 
/oh /hjo 


173 


857 


1530 


2128 


784 9Q1Q 

/oh zyjy 


174 


858 


1531 


2129 


784 9Q1Q 
/ oh £,yjy 


175 


859 


1532 


2130 


787 9049 

/ O / X,UHZ 


176 


860 


1533 


2131 


787 9049 
/ O / ZUhZ 


177 


861 


1534 


2132 


784 1017 

/OH J\JD / 


178 


862 


1535 


2133 


787 8909 


179 


863 


1536 


2134 


784 7561 


180 


864 








181 


865 


1537 


2135 


792 7045 


182 


866 


1538 


2136 


790 1109 


183 


867 


1539 


2137 


784 4481 

/ OH *rtOJ 


184 


868 


1540 


2138 


784 4481 

/ OH HHO J 


185 


869 


1541 


2139 


787 2061 


186 


870 


1542 


2140 


784 S081 

/ OH J I/O j 


187 


871 








188 


872 


1543 


2141 


7pc 571 
/ O J j / 1 


189 


873 


1544 


2142 


784 9517 

/OH Zjl/ 


190 


874 








191 


875 


1545 


2143 


784 2138 


192 


876 


1546 


2144 


784_9072 


193 


877 


1547 


2145 


787 9212 


194 


878 


1548 


2146 


784 5182 


195 


879 


1549 


2147 


784 5182 


196 


880 


1550 


2148 


784 5182 


197 


881 


1551 


2149 


788_1145 


198 


882 


1552 


2150 


785 3208 


199 


883 


1553 


2151 


785 2364 


200 


884 


1554 


2152 


787 6120 


201 


885 








202 


886 


1555 


2153 


785 2555 


203 


887 


1556 


2154 


785 2555 


204 


888 


1557 


2155 


/ OO JUaO 


205 


889 


1558 


2156 


785 2199 


206 


890 


1559 


2157 


785 316 


207 


891 


1560 


2158 


784 8768 


208 


892 


1561 


2159 




209 


893 


1562 


2160 


785 1574 


210 


894 


1563 


2161 


787 791 


211 


895 


1564 


2162 


784. 1 779 

/0*t VAIL 


212 


896 


1565 


2163 


784 1158 

/Ot lJJO 


213 


897 


1566 


2164 


787_4447 


214 


898 


1567 


2165 


787 4447 


215 


899 


1568 


2166 


784 4287 


216 


900 


1569 


2167 


784 7705 


217 


901 


1570 


2168 


784 1214 

/ 0*T LZt JL*T 


218 


902 


1571 


2169 


784 3287 


219 


903 


1572 


2170 


784 3287 


220 


904 


1573 


2171 


784 3950 


221 


905 


1574 


2172 


787 5951 


222 


906 


1575 


2173 


788 8994 


223 


907 


1576 


2174 


784 7897 


224 


908 


1577 


2175 


784_952 


225 


909 


1578 


2176 


784 952 


226 


910 


1579 


2177 


784 952 


227 


911 








228 


912 


1580 


2178 


788 6394 

1 OO VJ7t 


229 


913 


1581 


2179 


784 6391 

/ 0*T V/_J ^ 1 


230 


914 


1582 


2180 


784 7670 


231 


915 


1583 


2181 


784 4795 


232 


916 


1584 


2182 


784 3004 


233 


917 


1585 


2183 


784_3004 


234 


918 


1586 


2184 


784 3004 


235 


919 


1587 


2185 


790 1148 


236 


920 


1588 


2186 


784 7696 


237 


921 


1589 


2187 


787 7957 


238 


922 


1590 


2188 


787_7957 


239 


923 


1591 


2189 


787 7957 


240 


924 


1592 


2190 


787_7957 


241 


925 


1593 


2191 


787 7957 


242 


926 


1594 


2192 


784_4718 


243 


927 


1595 


2193 


785 3642 


244 


928 


1596 


2194 


787 6699 


245 


929 


1597 


2195 


784 6067 


246 


930 








247 


931 


1598 


2196 


784 8379 


248 


932 








249 


933 


1599 


2197 


784_6418 


250 


934 








251 


935 








252 


936 


1600 


2198 


784 3080 


253 


937 


1601 


2199 


792 3539 


254 


938 


1602 


2200 


784 4948 


255 


939 


1603 


2201 


787 4342 


256 


940 


1604 


2202 


784 7815 


257 


941 


1605 


2203 


784 5767 


258 


942 


1606 


2204 


784 5767 


259 


943 


1607 


2205 


784 5777 


260 


944 


1608 


2206 


784_5777 


261 


945 


1609 


2207 


784_5777 


262 


946 


1610 


2208 


784_5777 


263 


947 


1611 


2209 


784 4849 


264 


948 








265 


949 


1612 


2210 


787 6059 


266 


950 








267 


951 


1613 


2211 


784 3590 


268 


952 


1614 


2212 


784_337 


269 


953 


1615 


2213 


790 27506 


270 


954 


1616 


2214 


784 6469 


271 


955 


1617 


2215 


787 8139 


272 


956 


1618 


2216 


784 3189 


273 


957 


1619 


2217 


784 1459 


274 


958 


1620 


2218 


790 11947 


275 


959 


1621 


2219 


784 4007 


276 


960 


1622 


2220 


784 4007 


277 


961 


1623 


2221 


784 4007 


278 


962 


1624 


2222 


784_4007 


279 


963 








280 


964 


1625 


2223 


784 1398 


281 


965 


1626 


2224 


785 2523 


282 


966 








283 


967 


1627 


2225 


784 10126 


284 


968 


1628 


2226 


785 3232 


285 


969 


1629 


2227 


785 3232 


286 


970 


1630 


2228 


784 9436 


287 


971 


1631 


2229 


784 6743 


288 


972 


1632 


2230 


789 4182 


289 


973 


1633 


2231 


784 8857 


290 


974 


1634 


2232 


784 1226 


291 


975 


1635 


2233 


787 2898 


292 


976 


1636 


2234 


787_2898 


293 


977 


1637 


2235 


784 3743 


294 


978 


1638 


2236 


790_713 


295 


979 


1639 


2237 


790_713 


296 


980 








297 


981 


1640 


2238 


787 371 


298 


982 


1641 


2239 


784 10083 


299 


983 








300 


984 


1642 


2240 


787 1611 


301 


985 


1643 


2241 


787_1611 


302 


986 


1644 


2242 


784 77^ 

/ OH / / JJ 


303 


987 








304 


988 








305 


989 


1645 


2243 


784 964 


306 


990 


1646 


2244 


784 Q71Q 
/oh yfoy 


307 


991 


1647 


2245 


784 6V?^ 
/ oh O JZJ 


308 


992 


1648 


2246 


784 4695 
/ OH HOZ J 


309 


993 


1649 


2247 


787 8000 
/ o / oyyy 


310 


994 


1650 


2248 


787 9^86 

lOi ZrjOO 


311 


995 


1651 


2249 


784 A1A1 
1 oH_H /H.5 


312 


996 


1652 


2250 


784 6^*5 

/ OH OjjJ 


313 


997 


1653 


2251 


784 8945 


314 


998 


1654 


2252 


784 4654 


315 


999 


1655 


2253 


784 ^551 
/ o*t JJJ i 


316 


1000 


1656 


2254 


784 5897 
/ o*t J / 


317 


1001 


1657 


2255 


784 4Q84 

/ OH H;/OH 


318 


1002 


1658 


2256 


784 4Q84 

/ OH H^OH 


319 


1003 


1659 


2257 


784 T145 i 

/OH J 1HJ 


320 


1004 


1660 


2258 


784 8058 

/ OH OV/JO 


321 


1005 


1661 


2259 


784 1657 

/ OH 0 O 0 1 


322 


1006 


1662 


2260 


785 1101 

/ OJ I 17 I 


323 


1007 


1663 


2261 


784 5580 

/OH_J JOI/ 


324 


1008 


1664 


2262 


784 6981 

/ OH UZO 1 


325 


1009 


1665 


2263 


784 9185 

/ o*t Z> lOJ 


326 


1010 


1666 


2264 


787 407 

/ O / *T7 / 


327 


1011 


1667 


2265 


784 4047 


328 


1012 


1668 


2266 


784 8772 

/ 0*T O / / it 


329 


1013 


1669 


2267 


791 ^817 


330 


1014 


1670 


2268 


791 ^817 

/ x 1 Jul/ 


331 


1015 


1671 


2269 


784 8115 1 


332 


1016 


1672 


2270 


784 3141 


333 


1017 


1673 


2271 


784_3141 


334 


1018 


1674 


2272 


787 1645 


335 


1019 


1675 


2273 


785_256 


336 


1020 


1676 


2274 


784 1733 


337 


1021 


1677 


2275 


784 1858 

/ 0*T 1 O J o 


338 


1022 


1678 


2276 


784 1858 

/OH lOJO 


339 


1023 


1679 


2277 


700 5161 


340 


1024 


1680 


2278 


785 1 no 


341 


1025 


1681 


2279 


785_102 


342 


1026 


1682 


2280 


787 4041 


343 


1027 


1683 


2281 


792_3856 


344 


1028 


1684 


2282 


787 3012 


345 


1029 


1685 


2283 


787_3012 


346 


1030 


1686 


2284 


784 1108 


347 


1031 


1687 


2285 


785 435 


348 


1032 


1688 


2286 


785 2364 


349 


1033 


1689 


2287 


784 2969 


350 


1034 


1690 


2288 


784 7604 


351 


1035 








352 


1036 


1691 


2289 


787 ^016 

/Of J\J lO 


353 


1037 


1692 


2290 




354 


1038 


1693 


2291 


7Q0 9601 
i y\j zou j 


355 


1039 


1694 


2292 


787 6QQQ 


356 


1040 


1695 


2293 


784, 159£ 

/ OH- JJZO 


357 


1041 


1696 


2294 


784 6114 
/ o*t 010*+ 


358 


1042 


1697 


2295 


784 5095 


359 


1043 


1698 


2296 


784 9 1 1 Q 
/o*+ ziiy 


360 


1044 


1699 


2297 


787 9789 
/o/ Z/oZ 


361 


1045 


1700 


2298 


7524 10971 
/ ot 1 l/Z / 1 


362 


1046 


1701 


2299 


785 9701 

/ OD Z /Ul 


363 


1047 


1702 


2300 


784 QRQ9 


364 


1048 


1703 


2301 


785 1616 


365 


1049 








366 


1050 


1704 


2302 


785 166 


367 


1051 


1705 


2303 


784 8058 


368 


1052 








369 


1053 


1706 


2304 


789 1756 

/ O^r 1 / JU 


370 


1054 


1707 


2305 


787 10016 


371 


1055 


1708 


2306 


784 8181 


372 


1056 


1709 


2307 


787_4467 


373 


1057 


1710 


2308 


787 4467 


374 


1058 


1711 


2309 


787 4467 


375 


1059 


1712 


2310 


787_4467 


376 


1060 


1713 


2311 


784 8914 

/ Of OZJ*T 


377 


1061 


1714 


2312 


784 470 

/ 0*T *T / V 


378 


1062 


1715 


2313 


784 8940 


379 


1063 








380 


1064 


1716 


2314 


784 9166 


381 


1065 


1717 


2315 


784 7964 


382 


1066 


1718 


2316 


790 91118 


383 


1067 


1719 


2317 


784 6659 


384 


1068 


1720 


2318 


784 8964 


385 


1069 


1721 


2319 


787 2108 


386 


1070 


1722 


2320 


784 4485 

f Ot t*TOJ 


387 


1071 


1723 


2321 


784 4689 

/ Ot tU07 


388 


1072 


1724 


2322 


785 1448 


389 


1073 


1725 


2323 


785 1150 


390 


1074 


1726 


2324 


784 4498 


391 


1075 


1727 


2325 


787 5857 


392 


1076 


1728 


2326 


784 8283 


393 


1077 


1729 


2327 


784 8283 


394 


1078 


1730 


2328 


784 1601 


395 


1079 


1731 


2329 


784 1601 


396 


1080 


1732 


2330 


784 1601 


397 


1081 


1733 


2331 


784 1601 


398 


1082 


1734 


2332 


784 1601 


399 


1083 


1735 


2333 


785 3693 


400 


1084 


1736 


2334 


788 8918 


401 


1085 


1737 


2335 


787 757 


402 


1086 


1738 


2336 


784 1907 


403 


1087 


1739 


2337 


784 10178 


404 


1088 


1740 


2338 


784 10178 


405 


1089 


1741 


2339 


784 8535 


406 


1090 


1742 


2340 


784 8535 


407 


1091 


1743 


2341 


784 8535 


408 


1092 


1744 


2342 


784 8301 


409 


1093 


1745 


2343 


784 8301 


410 


1094 


1746 


2344 


787 10129 


411 


1095 








412 


1096 


1747 


2345 


787 4498 


413 


1097 


1748 


2346 


787 4498 


414 


1098 


1749 


2347 


790 27173 


415 


1099 


1750 


2348 


787 4500 


416 


1100 


1751 


2349 


785 3699 


417 


1101 


1752 


2350 


784 952 


418 


1102 


1753 


2351 


784 952 


419 


1103 


1754 


2352 


787 1871 


420 


1104 


1755 


2353 


784 1835 


421 


1105 


1756 


2354 


785 2845 


422 


1106 


1757 


2355 


784 9214 


423 


1107 


1758 


2356 


784_2232 


424 


1108 


1759 


2357 


784 2232 


425 


1109 


1760 


2358 


792_6149 


426 


1110 








427 


1111 


1761 


2359 


784 6702 


428 


1112 


1762 


2360 


784_8354 


429 


1113 








430 


1114 








431 


1115 


1763 


2361 


787_9215 


432 


1116 








433 


1117 


1764 


2362 


785 2878 


434 


1118 


1765 


2363 


785 2878 


435 


1119 


1766 


2364 


784 10026 


436 


1120 


1767 


2365 


784 6265 


437 


1121 


1768 


2366 


785 2731 


438 


1122 


1769 


2367 


787 6236 


439 


1123 


1770 


2368 


785 1252 


440 


1124 








441 


1125 








442 


1126 


1771 


2369 


791 3415 


443 


1127 


1772 


2370 


785 3334 


444 


1128 


1773 


2371 


784 8215 


445 


1129 


1774 


2372 


784 10074 


446 


1130 


1775 


2373 


784 10074 


447 


1131 


1776 


2374 


784 3863 


448 


1132 








449 


1133 


1777 


2375 


784 2811 


450 


1134 


1778 


2376 


790 28311 


451 


1135 


1779 


2377 


784 4221 


452 


1136 


1780 


2378 


785 1480 


453 


1137 


1781 


2379 


784 2520 


454 


1138 


1782 


2380 


784 1312 


455 


1139 


1783 


2381 


784_633 


456 


1140 


1784 


2382 


785_590 


457 


1141 


1785 


2383 


785 590 


458 


1142 


1786 


2384 


790 12519 


459 


1143 


1787 


2385 


784 7001 


460 


1144 


1788 


2386 


784 7001 


461 


1145 


1789 


2387 


788 5657 


462 


1146 


1790 


2388 


784 4745 


463 


1147 


1791 


2389 


787 6106 


464 


1148 


1792 


2390 


787 2727 


465 


1149 


1793 


2391 


784 3950 


466 


1150 


1794 


2392 


790 10584 


467 


1151 


1795 


2393 


784 2612 


468 


1152 


1796 


2394 


787 2965 


469 


1153 


1797 


2395 


787_2965 


470 


1154 


1798 


2396 


787 8641 


471 


1155 


1799 


2397 


785 3774 


472 


1156 








473 


1157 


1800 


2398 


784 8542 


474 


1158 


1801 


2399 


784_8542 


475 


1159 








476 


1160 


1802 


2400 


790 13566 


477 


1161 


1803 


2401 


785 410 


478 


1162 








479 


1163 


1804 


2402 


784 5054 1 


480 


1164 








481 


1165 


1805 


2403 


785 3036 


482 


1166 


1806 


2404 


789 4683 


483 


1167 








484 


1168 


1807 


2405 


784 6816 


485 


1169 


1808 


2406 


784 5981 


486 


1170 


1809 


2407 


785 3078 


487 


1171 


1810 


2408 


784_2586 


488 


1172 


1811 


2409 


784 6539 


489 


1173 


1812 


2410 


784 6539 


490 


1174 


1813 


2411 


784 6539 


491 


1175 


1814 


2412 


784 8016 


492 


1176 


1815 


2413 


787 10370 


493 


1177 


1816 


2414 


784 5450 


494 


1178 


1817 


2415 


787 7533 


495 


1179 


1818 


2416 


785 3119 


496 


1180 


1819 


2417 


785 3120 


497 


1181 


1820 


2418 


785 3122 


498 


1182 


1821 


2419 


784 9756 


499 


1183 


1822 


2420 


784 4843 


500 


1184 


1823 


2421 


784 441 


501 


1185 


1824 


2422 


784 1095 


502 


1186 


1825 


2423 


784 1066 


503 


1187 


1826 


2424 


785 206 


504 


1188 


1827 


2425 


784 4128 


505 


1189 


1828 


2426 


784_4128 


506 


1190 


1829 


2427 


784 4128 


507 


1191 


1830 


2428 


790 27336 


508 


1192 








509 


1193 


1831 


2429 


784 2678 


510 


1194 


1832 


2430 


784 3456 


511 


1195 








512 


1196 


1833 


2431 


785 582 


513 


1197 








514 


1198 


1834 


2432 


789 4888 


515 


1199 


1835 


2433 


789 4172 


516 


1200 


1836 


2434 


784 9397 


517 


1201 








518 


1202 


1837 


2435 


784 1307 


519 


1203 


1838 


2436 


789 5903 


520 


1204 


1839 


2437 


784 9886 


521 


1205 


1840 


2438 


784 2293 


522 


1206 


1841 


2439 


784 5604 


523 


1207 


1842 


2440 


784 7569 


524 


1208 








525 


1209 


1843 


2441 


784_9399 


526 


1210 


1844 


2442 


784 5253 


527 


1211 


1845 


2443 


784 8932 


528 


1212 


1846 


2444 


784 7850 


529 


1213 


1847 


2445 


787 10375 


530 


1214 


1848 


2446 


792 2784 


531 


1215 


1849 


2447 


784 2550 


532 


1216 


1850 


2448 


784 3066 


533 


1217 


1851 


2449 


785 2240 


534 


1218 


1852 


2450 


785 76 


535 


1219 


1853 


2451 


792 6297 


536 


1220 








537 


1221 


1854 


2452 


792 J062 


538 


1222 


1855 


2453 


784 9474 


539 


1223 








540 


1224 








541 


1225 


1856 


2454 


784 3898 


542 


1226 


1857 


2455 


784 4445 


543 


1227 


1858 


2456 


784 9615 


544 


1228 


1859 


2457 


784 10126 


545 


1229 


1860 


2458 


784 9880 


546 


1230 








547 


1231 


1861 


2459 


785 3774 


548 


1232 


1862 


2460 


785 3774 


549 


1233 


1863 


2461 


785 3774 


550 


1234 


1864 


2462 


784 1315 


551 


1235 








552 


1236 


1865 


2463 


790 16605 


553 


1237 


1866 


2464 


784 2311 


554 


1238 


1867 


2465 


787 8252 


555 


1239 


1868 


2466 


784 5605 


556 


1240 


1869 


2467 


784 3824 


557 


1241 








558 


1242 


1870 


2468 


785 3563 


559 


1243 


1871 


2469 


790 20271 


560 


1244 








561 


1245 








562 


1246 


1872 


2470 


790_5164 


563 


1247 


1873 


2471 


785 3680 


564 


1248 


1874 


2472 


784 2988 


565 


1249 


1875 


2473 


787 4774 


566 


1250 








567 


1251 


1876 


2474 


784 9364 


568 


1252 


1877 


2475 


784_9364 


569 


1253 


1878 


2476 


784 8765 


570 


1254 








571 


1255 


1879 


2477 


790_12841 


572 


1256 


1880 


2478 


787 4398 


573 


1257 


1881 


2479 


787 4398 


574 


1258 








575 


1259 








576 


1260 


1882 


2480 


788 12600 


577 


1261 


1883 


2481 


790 16405 


578 


1262 


1884 


2482 


787 7025 


579 


1263 








580 


1264 


1885 


2483 


784 4168 


581 


1265 


1886 


2484 


790 26483 


582 


1266 


1887 


2485 


790 26483 


583 


1267 








584 


1268 


1888 


2486 


790 2440 


585 


1269 








586 


1270 


1889 


2487 


784 1755 


587 


1271 








588 


1272 


1890 


2488 


790 21097 


589 


1273 








590 


1274 


1891 


2489 


787 4393 


591 


1275 


1892 


2490 


784 3590 


592 


1276 


1893 


2491 


787 933 


593 


1277 


1894 


2492 


790 8149 


594 


1278 








595 


1279 


1895 


2493 


787 6126 


596 


1280 


1896 


2494 


785 3201 


597 


1281 


1897 


2495 


784_360 


598 


1282 


1898 


2496 


784 360 


599 


1283 


1899 


2497 


784 360 


600 


1284 


1900 


2498 


784 270 1 


601 


1285 


1901 


2499 


784 5003 


602 


1286 


1902 


2500 


784 6919 


603 


1287 


1903 


2501 


790 27941 


604 


1288 


1904 


2502 


790 19516 


605 


1289 


1905 


2503 


785 1001 


606 


1290 








607 


1291 


1906 


2504 


784 1320 


608 


1292 


1907 


2505 


785 3606 


609 


1293 


1908 


2506 


785 3606 


610 


1294 


1909 


2507 


784 8851 


611 


1295 








612 


1296 


1910 


2508 


792 4796 


613 


1297 


1911 


2509 


787_1962 


614 


1298 


1912 


2510 


787 1962 


615 


1299 








616 


1300 


1913 


2511 


791 4419 


617 


1301 


1914 


2512 


784 287 


618 


1302 


1915 


2513 


784_287 


619 


1303 








620 


1304 


1916 


2514 


784 4933 


621 


1305 


1917 


2515 


784_4933 


622 


1306 








623 


1307 


1918 


2516 


784 1318 


624 


1308 


1919 


2517 


784 3284 


625 


1309 


1920 


2518 


784 3284 


626 


1310 


1921 


2519 


784 915 


627 


1311 


1922 


2520 


784_7261 


628 


1312 


1923 


2521 


784 5106 


629 


1313 


1924 


2522 


785 598 


630 


1314 


1925 


2523 


787 4996 


631 


1315 


1926 


2524 


785 1259 


632 


1316 


1927 


2525 


785 1259 


633 


1317 


1928 


2526 


792 4498 


634 


1318 








635 


1319 


1929 


2527 


784 4291 


636 


1320 


1930 


2528 


784_4291 


637 


1321 


1931 


2529 


784_7003 


638 


1322 


1932 


2530 


784 7701 


639 


1323 


1933 


2531 


784 7701 


640 


1324 


1934 


2532 


784 2330 


641 


1325 


1935 


2533 


789 6254 


642 


1326 


1936 


2534 


789 6254 


643 


1327 


1937 


2535 


785_2282 


644 


1328 


1938 


2536 


790_23335 


645 


1329 








646 


1330 


1939 


2537 


785_2954 


647 


1331 








648 


1332 








649 


1333 








650 


1334 








651 


1335 


1940 


2538 


784 3290 


652 


1336 


1941 


2539 


784 1408 


653 


1337 


1942 


2540 


784 5274 


654 


1338 








655 


1339 








656 


1340 








657 


1341 


1943 


2541 


790 26963 


658 


1342 








659 


1343 


1944 


2542 


787 2980 


660 


1344 


1945 


2543 


784 4818 


661 


1345 


1946 


2544 


784 5145 


662 


1346 


1947 


2545 


784 9169 


663 


1347 


1948 


2546 


785 1586 


664 


1348 


1949 


2547 


784 1600 


665 


1349 


1950 


2548 


784 9629 


666 


1350 


1951 


2549 


784 9248 


667 


1351 


1952 


2550 


787 7062 


668 


1352 


1953 


2551 


784 7286 


669 


1353 








670 


1354 


1954 


2552 


785_254 : 


671 


1355 


1955 


2553 


784 8867 


672 


1356 


1956 


2554 


784J7020 1 


673 


1357 


1957 


2555 


784 7020 


674 


1358 


1958 


2556 


788 1533 


675 


1359 


1959 


2557 


787„2028 


676 


1360 


1960 


2558 


785 2715 


677 


1361 


1961 


2559 


784 6946 


678 


1362 


1962 


2560 


784 6946 


679 


1363 


1963 


2561 


784 935 


680 


1364 


1964 


2562 


784J103 


681 


1365 








682 


1366 


1965 


2563 


784 1601 


683 


1367 


1966 


2564 


785_122 


684 


1368 









784_XXX = SEQ ID NO: XXX of Attorney Docket No. 784, US Serial No. 09/488,725 filed 01/21/2000, the 
entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is 
the parent application of a continuation-in-part application bearing Attorney Docket No. 784CIP, US 
Application Serial No. 09/552,317, filed April 25, 2000, which in turn is a parent application of continuation- 
in-part application bearing Attorney Docket No. 784CIP3A/PCT, PCT Serial No. PCT/US00/35017 filed 
December 22, 2000, both of which are incorporated herein by reference in their entirety, including Tables, and 
Sequence Listing. 

785_XXX = SEQ ID NO: XXX of Attorney Docket No. 785, US Serial No. 09/491,404 filed 01/25/2000, the 
entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is 
the parent application of a continuation-in-part application bearing Attorney Docket No. 785CIP3/PCT, PCT 
Serial No. PCT/US01/02623 filed January 25, 2001, which is incorporated herein by reference in its entirety, 
including Tables, and Sequence Listing. 

787_XXX = SEQ ID NO: XXX of Attorney Docket No. 787, US Serial No. 09/496,914 filed 02/03/2000, the 
entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is 
the parent application of a continuation-in-part application bearing Attorney Docket No. 787CIP, US 
Application Serial No. 09/560,875, filed April 27, 2000, which in turn is a parent application of continuation- 
in-part application bearing Attorney Docket No. 787CIP3/PCT, PCT Serial No. PCT/US01/03800 filed 
February 5, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and 
Sequence Listing. 

788_XXX = SEQ ID NO: XXX of Attorney Docket No. 788, US Serial No. 09/515,126 filed 02/28/2000, the 
entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is 
the parent application of a continuation-in-part application bearing Attorney Docket No. 788CIP, US 
Application Serial No. 09/577,409, filed May 18, 2000, which in turn is a parent application of continuation-in- 
part application bearing Attorney Docket No. 788CIP3/PCT, PCT Serial No. PCT/US01/04927 filed February 
26, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence 
Listing. 

789_XXX = SEQ ID NO: XXX of Attorney Docket No. 789, US Serial No. 09/519,705 filed 03/07/2000, the 
entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is 
the parent application of a continuation-in-part application bearing Attorney Docket No. 789CIP, US 
Application Serial No. 09/574,454, filed May 19, 2000, which in turn is a parent application of continuation-in- 
part application bearing Attorney Docket No. 789CIP3/PCT, PCT Serial No. PCT/US01/04941 filed March 5, 
2001, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence 
Listing. 

790_XXX = SEQ ID NO: XXX of Attorney Docket No. 790, US Serial No. 09/540,217 filed 03/31/2000, the 
entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is 
the parent application of a continuation-in-part application bearing Attorney Docket No. 790CIP, US 
Application Serial No. 09/649,167, filed August 23, 2000, which in turn is a parent application of continuation- 
in-part application bearing Attorney Docket No. 790CIP3/PCT, PCT Serial No. PCT/US01/08631 filed March 
30, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence 
Listing. 

791_XXX = SEQ ID NO: XXX of Attorney Docket No. 791, US Serial No. 09/552,929 filed 04/18/2000, the 
entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is 
the parent application of a continuation-in-part application bearing Attorney Docket No. 791CIP, US 
Application Serial No. 09/770,160, filed January 26, 2001, which in turn is a parent application of 
continuation-in-part application bearing Attorney Docket No. 791CIP3/PCT, PCT Serial No. PCT/US01/08656 
filed April 18, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and 
Sequence Listing. 

792_XXX = SEQ ID NO: XXX of Attorney Docket No. 792, US Serial No. 09/577,408 filed 05/18/2000, the 
entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is 
the parent application of a continuation-in-part application bearing 792CIP3/PCT, PCT Serial No. 
PCT/US01/14827 filed May 16, 2001, which is incorporated herein by reference in its entirety, including 
Tables, and Sequence Listing. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 1-684. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein 
said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein 
said polynucleotide has greater than about 99% sequence identity with the polynucleotide of 
claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting 
of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; 
and 

(b) a polypeptide encoded by a polynucleotide hybridizing under 
stringent conditions with any one of SEQ ID NO: 1-684. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in the 
sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex 
is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, 
under conditions sufficient to form a polypeptide/compound complex, wherein the complex 
drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, 
so that if the polypeptide/compound complex is detected, a compound that binds to the 
polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected 
from the group consisting of any of the polynucleotides from SEQ ID NO: 1-684, under 
conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 685-1368. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises at least one of 
SEQ ID NO: 1-684. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 



